ABSTRACT

While debates over the heritability of IQ and the 
potential for culture bias in measuring instruments have generated 
much research and public comment, it is also possible to investigate
the significance of interracial differences in mean IQ by ignoring 
both the foregoing issues and instead examining the social psychology 
of the test situation itself. Male and female students between 12 and 
16 years of age completed the Wechsler Intelligence Scale for 
Children Performance sub-scales in a variety of settings. The 
variables of test atmosphere (evaluative or gamelike), tester
expectation (high or low), race of tester (black or white), and race
of subject were placed in a two by two by two by two factorial
design. At a second session some weeks after taking the WISC,
subjects completed a group administered questionnaire. The pattern of 
mean IQ scores as well as mood and personality data indicated that 
test performance was optimal at moderate levels of motivational
arousal. A replication of the experiment for male subjects increased 
cell sizes to the point that socio-economic status could be treated 
as an independent variable in the design. When this was done, the
results suggested that interracial differences in mean IQ might be
erased depending upon the social-psychological characteristics of the
test setting and the socio-economic background of the testee. 
(Author/JM) 
ABSTRACT

Male and female students between 12 and 16 years of age completed
the WISC performance subscales in a variety of settings. The variables
of test atmosphere (evaluative or gamelike), tester expectation (high 
or low), race of tester (black or white), and race of subject were placed 
in a 2x2x2x2 factorial design. The pattern of mean IQ scores as well 
as mood and personality data indicated that test performance was optimal 
at moderate levels of motivational arousal. A replication of the experiment
for male subjects increased cell sizes to the point that socio-economic
status could be treated as an independent variable in the design.
When this was done, the results suggested that interracial differences
in mean IQ might be erased depending upon the social-psychological
characteristics of the test setting and the socio-economic background of the
testee.
Motivation, Race, Social Class, and IQ

If an IQ test is administered to children attending a racially integrated
school, blacks will generally average from 11 to 15 points below
whites (though there will be a substantial overlap in the two distributions).
Jensen's (1969) suggested explanation for this phenomenon was that, since
IQ has a very high heritability, large mean differences between racial
groups must be predominantly genetic in origin. Environmentalists (e.g.,
Garcia, 1972) have replied that any such interracial differences are entirely
attributable to an alleged bias in the content of IQ tests in favor of
cultural experiences more readily accessible to middle and upper income
whites.

While debates over the heritability of IQ and the potential for culture 
bias in measuring instruments have generated much research and public 
comment, it is also possible to investigate the significance of interracial 
differences in mean IQ by ignoring both the foregoing issues and instead 
examining the social psychology of the test situation itself. Sattler's
(1970) review of the extensive literature on this topic found support
for several hypotheses (among others): black children may have generally
lower achievement motivation than whites; expectations of failure or fear
of appearing "uppity" may impair the performance of black children when
they anticipate comparison with whites; the performance of black children
may be improved by providing a same-race tester; the performance of all
children may be improved when the tester has a favorable rather than an
unfavorable expectation of their ability [when overtly revealed, such
expectations may result in what Rosenthal (1965) has called an "experimenter"
or instructional effect]. Since black students have in the past usually
been tested by white examiners in what are often competitive, ego-threatening
situations, it is conceivable, if black examinees tend to believe that
practically any white tester will be prejudiced against them, that
performance-debilitating self-fulfilling prophecies could be set in motion
(Rosenthal and Jacobsen, 1968).

However, the influence of the test setting on the observed ability
of black and white testees is not as consistently predictable as the preceding
paragraph implies. Katz (1970), a prominent researcher in this area, has
found, rather paradoxically, that the performance of black students often 
improves when they are tested by a white rather than by a black examiner, 
when they are in the presence of white rather than black agemates, or when 
they are to be compared against white rather than black norms. According 
to Katz, this may occur because whites are regarded as evaluators that one 
should try harder to impress or because comparison with white standards 
is more informative for self-evaluations of ability. 

A study with hybrid results was recently reported by McClelland (1974). 
He found that both black and white subjects were more cooperative and 
motivated to achieve when a white rather than a black interviewer asked 
them to complete a battery of items from intelligence and personality
tests. But both groups scored higher in intelligence in the presence of
a black interviewer, perhaps, as McClelland suggests, because the white
tester stimulated evaluation apprehension along with achievement motivation,
thus producing "lower intelligence test scores, due to higher anxiety."

It may in fact be the case that the single construct of testee motivation 
(one element of which is test anxiety) could account for this complex array 
of evidence. Wine (1971) suggested that anxiety elicits "two classes of
responses: those related to task completion, which are anxiety reducing,
and those which interfere with task completion." When anxiety is extremely
high, the performer's state of internal arousal becomes disruptively
distracting, which causes the interfering responses to predominate and
which in turn debilitates performance. Sarason (1961) found that under
conditions of ego threat habitually low test anxious subjects will surpass
those whose performance has been debilitated by high test anxiety. Sarason
also found, however, that under relaxed, non-threatening conditions, subjects 
low in test anxiety solved fewer anagrams than their habitually high test 
anxious counterparts; presumably, the latter were experiencing only a moderate, 
optimally motivating level of arousal due to the relaxed conditions whereas 
the former were not sufficiently motivated to complete the task. Weiner 
and Samuel (in press) administered Sarason's anagrams in an ego-threatening 
environment to chronically high test anxious subjects, some of whom were 
led to mislabel their test-induced physiological symptoms of anxiety as 
being due to the side effects of a capsule (placebo) which they had swallowed 
earlier. This group rated itself less anxious and was able to solve more 
anagrams than controls not given the opportunity to mislabel the source 
of their symptoms of internal arousal. 

Such processes could account for the seemingly contradictory effects 
for race of tester on testee performance which have emerged in the literature. 
A white tester may be perceived as a more powerful evaluator than a black
counterpart and so will elicit a better performance from testees so long 
as the greater internal arousal associated with his presence goes no higher 
than the optimal, moderate level. Certain environments may, however, induce 
some anxiety or arousal independently of the characteristics of the examiner. 
Under these conditions, the presence of a white evaluator could be excessively
arousing and so might result in performance impairment; instead, a less
disruptively stimulating black evaluator might be able to elicit a superior
performance. 

In the present experiment, race of subject and race of tester were 
systematically varied. In addition, there were two test atmospheres such 
that students in one condition were explicitly told they were completing a 
battery of competitive tests while those in another condition thought they
were working with a set of creative games and playthings. The tester also
expressed either a high or a low expectation for the subject's probable
performance. In all cells of this 2x2x2x2 factorial pairing of race of
tester, race of subject, test atmosphere, and tester expectation, subjects
completed the performance subscales of the Wechsler Intelligence Scale
for Children.

It was anticipated that whites would, overall, score 11-15 points 
higher in IQ than blacks, since this appears to be a stable finding in the 
literature. For subjects of both races, however, it was predicted that in 
the evaluative atmosphere the ego-threatening and competitive nature of 
the instructions would induce a state of fairly high anxiety in testees 
and that if the tester then expressed a high expectation this might reduce 
anxiety slightly to a more moderate level and so facilitate performance; if 
the tester instead expressed a low expectation in the evaluative atmosphere,
the added stress of his criticism should definitely debilitate performance.
In the gamelike atmosphere, by contrast, the setting was anticipated to be
so relaxed that testee motivation would be insufficient for optimal performance.
Here it was predicted that achievement would be facilitated rather than 
debilitated by the moderate anxiety induced by a tester's low expectation. 

It was further speculated that there might be situations in which 
reactance motivation could override the debilitative effects of high test
anxiety. Brehm's (1966) theory of reactance states that when people feel
their freedom of action is being threatened by manipulation or coercion
they will resist the threat and seek to emphasize their freedom to behave
oppositely. In the present research, it was felt that if testees in the 
ego-threatening evaluative atmosphere were challenged by a low expectation 
on the part of an opposite-race tester, an especially strong desire to 
disprove the tester's negative assessment might lead to an effort at 
suppressing task-interfering responses to permit a resolute concentration
on task completion. Baron and Ganz (1972) have suggested that reactance
motivation might be especially likely to be aroused in black students
confronted by a white evaluator, and Allen, Dubanoski, and Stevenson
(1966) have reported that among older children criticism from a white 
experimenter was actually more effective than praise in maintaining the 
performance level of black testees. 

To summarize, it was anticipated that in the evaluative atmosphere 
subjects would perform better on the WISC following an expressed high 
expectation on the part of the tester (an instructional effect). In the 
gamelike atmosphere, however, it was predicted that performance would be
optimal following a tester's low expectation; in the sense that the subject
was predicted to behave oppositely from the tester's overt "demand," this
might be called a reactance effect, though other motivational states, such 
as anxiety or irritation resulting from the tester's criticism, were 
also expected to contribute to the phenomenon. For students in the 
evaluative atmosphere, it was speculated that an opposite-race tester's 
low expectation might be viewed as a challenge; if so, it might arouse an 
especially strong motivation to disconfirm the tester's negative assessment, leading 
to a reactance effect rather than the instructional effect which was
otherwise predicted for the evaluative atmosphere. Overall, it was
anticipated that if performance responded as predicted to the manipulation 
of the testee's motivational state, the size of interracial differences in 
mean IQ would be found to be somewhat more flexible than was suggested 
by Jensen's (1969) review of the literature. 

EXPERIMENT I 



Method

During 1972-73, the WISC performance measures were administered to
208 black and 208 white junior high and high school students between 12 and 
16 years of age, equally divided by sex. The 2x2x2x2x2 factorial design 
varied test atmosphere (evaluative or gamelike), tester expectation (high 
or low), race of tester (black or white) and race and sex of subject. 
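
The cell structure implied by this design can be checked with a short sketch. It is an illustration only: the figure of 13 subjects per cell is taken from the Results section, the variable names are hypothetical, and the error degrees of freedom assume a fully crossed between-subjects analysis of variance.

```python
from itertools import product

# Two-level factors as described in the Method section (names are illustrative).
factors = {
    "atmosphere": ("evaluative", "gamelike"),
    "expectation": ("high", "low"),
    "tester_race": ("black", "white"),
    "subject_race": ("black", "white"),
    "subject_sex": ("male", "female"),
}

# Every combination of factor levels is one cell of the design.
cells = list(product(*factors.values()))

subjects_per_cell = 13  # reported in the Results section
total_subjects = len(cells) * subjects_per_cell

# Assumed: within-cell error term of a fully crossed between-subjects ANOVA.
error_df = total_subjects - len(cells)

print(len(cells), total_subjects, error_df)
```

Under these assumptions the design has 32 cells and 416 subjects, which agrees with the 208 black and 208 white students, and the error term carries 384 degrees of freedom, the denominator that appears in the five-factor F ratios in the Results.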

Arriving at an office provided by the school, the subject encountered 
Experimenter 1 of a team of two male experimenters. One team consisted of
two white and the other of two black experimenters. Experimenter 1 described
himself as a representative of Psychology Incorporated, a company which
manufactures either "tests of intelligence and mental capacity" (evaluative
atmosphere) or "creative games and playthings" (gamelike atmosphere). In
the evaluative condition the tester further declared that the subject's
performance on the tests would be compared to that of other students at
the school and against city and nationwide norms. In the gamelike condition,
Experimenter 1 assured the subject that performance on the tasks was the
subject's "own thing" and that he should relax and take it easy since
"no one is going to be compared to anyone else here." To supplement these
manipulations, the tester wore a tie and jacket in the evaluative conditions
but removed the jacket and loosened the tie when the atmosphere was to be
gamelike. All experimenters were in their middle or late twenties.

Experimenter 1 revealed that Psychology Incorporated had reviewed the
student's grades in "think" courses like math, English, or art as well as
"action" courses like gym, shop, or home economics to arrive at a
prediction for the subject's performance. Those in the high expectation
conditions were told they could expect to have an easy time with the tasks
while those in the low expectation treatments were told they would probably
have a difficult time.

The student was then given the Object Assembly subtest of the WISC.
In the evaluative atmosphere, an imposing interval timer was used to score
performance on the items; in the gamelike atmosphere, the examiner used a
wall clock and explained that he regretted having to time the activities
but must try to keep things on schedule since other students would be
arriving later. When the subtest was completed, Experimenter 1 reinforced
the expectation manipulation by announcing that the subject had either done
rather well, above the average, or rather poorly, below the average. He
then explained that his partner, Mr. ___, had a few other tests
(or games) for the student to work with. As he departed, Experimenter 1
removed the interval timer from the table in the evaluative atmosphere
or put on his jacket and straightened his tie in the gamelike condition;
he was then replaced by Experimenter 2, who was blind as to the subject's
prior treatment. Experimenter 2 administered the Picture Arrangement,
Picture Completion, Block Design, and Coding subtests.

After completing the WISC, the subject made a self-rating of performance
on a V.unit scale running from "Very poorly" to "Very well." Next, he or 
she filled out a mood adjective checklist (Nowlis, 1965). The adjectives
on the checklist comprise scales for measuring aggression (e.g., "angry"),
anxiety ("clutched up"), concentration ("engaged in thought"), egotism
("boastful"), elation ("overjoyed"), fatigue ("tired"), sadness ("sorry"),
skepticism ("suspicious"), and surgency ("playful"). For each adjective,
the subject indicated the degree to which it described his or her feelings
on a 4-unit scale running from "definitely not" to "definitely." Finally,
subjects were asked their home address and the occupations of their parents.
Parents' occupations were referred to the tables of ranked occupational
categories in Duncan, Featherman, and Duncan (1972). The addresses were
referred to census tracts as another measure of social class. The combined
social class index ranged from 0 to 100. 

At a second session some weeks later, subjects completed a group- 
administered questionnaire. This included a fully validated version of 
Rotter's internal-external scale adapted for use with children by Nowicki
and Strickland (1973). According to Rotter (1966) "internals" generally
believe they have control over the events which occur in their lives
while "externals" believe their fates are decided by powerful deliberate
or circumstantial forces beyond their control. Higher scores on the I-E
scale are associated with greater externality. Also completed was the
Marlowe-Crowne Social Desirability Scale (Crowne and Marlowe, 1960), 
which measures the strength of the subject's need for approval from 
others. In addition, two subscales for general and test-specific anxiety 
from Janis and Field (1959) were included on the questionnaire. 

The last items on the questionnaire specifically tapped attitudes 
toward women and blacks. One item for each target group asked the subject
to indicate the frequency with which he or she thought the group had
encountered discrimination, on a scale running from "never" to "extremely
often." A second item asked the degree to which the subject felt women
and blacks should oppose discrimination when and if it occurred, from
"Always should relax and go along" to "Always stand up aggressively."
The third item asked for the degree to which the subject agreed with
whatever he or she felt was meant by "black is beautiful" or "women's
liberation," respectively. Scores on the second and third items were
combined for each target group into indices labeled "black is beautiful,"
for attitudes toward black assertiveness, and "women's liberation," for 
attitudes toward female assertiveness. Each of the three items was 
rated on a six-unit scale. 

Since the subjects were students under 16 years old, especial care 
was taken to see that each participant left the first session in a pleasant 
frame of mind. Particularly in the low expectation treatments, the 
experimenters emphasized that they had really wanted to study the effect
of a person's mood on his or her test performance and that in order to 
accomplish this it had been necessary for the tester to exaggerate some 
of the things he had said about their abilities. Subjects were reassured 
as to the quality and complete confidentiality of their own performance, paid 
$3.00, sworn to secrecy, and released. One index of subject satisfaction 
is the degree to which they maintained silence. A probe for prior knowledge 
was conducted both before and during each debriefing; it did not prove 
necessary to discard any subject for suspicion induced by a prior
participant's breach of confidence.

RESULTS 

Success of the Experimental Manipulations 

Self-ratings of performance were considerably more positive for 
subjects in the high than in the low expectation conditions (F(1,412) =
235.28, p<.001). Responses to the mood checklist also tended to confirm
the success of the atmosphere as well as the expectation manipulations.
Elation was greater in the high expectation treatment than in the low
(F(1,412) = 4.93, p<.05) and was also greater in the gamelike than in
the evaluative atmosphere (F(1,412) = Ib.'^^, p<.01). Anxiety was greater
in the evaluative than in the gamelike atmosphere (F(1,412) = 7.88, p<.01).

IQ Data

The IQ scores reported below are derived from the four subtests 
administered by Experimenter 2, prorated according to procedures in the
WISC scoring manual (Wechsler, 1949). The prorating caused calculated
IQs to be slightly higher than they would have been if the discarded
Object Assembly score had instead been included and no prorating applied.

Table 1 and Figure 1 show the mean IQ measured in each cell of the 
experimental design (n = 13 subjects per cell). Male and female subjects 

Insert Table 1 and Figure 1 about here 

did not differ appreciably in IQ (F(1,384) = 1.79, n.s.), but white
students scored higher in overall IQ than black students (F(1,384) = 109.45,
p<.001). The overall mean IQ for whites was 111.13 while that for blacks
was 96.67; the overall difference in mean IQ between the races was thus 
14.46 points. Students of both races generally performed better in the 
presence of a white rather than a black tester (F(1,384) = 17.11, p<.01).

A significant Atmosphere x Expectation interaction (F(1,384) = 6.5^,
p<.02) developed from the mean IQs shown in Table 2. These means 

Insert Table 2 about here 

combined the scores of male or female, black or white subjects tested by 
black or white experimenters. In an evaluative atmosphere, students scored 
higher in IQ if they were told they would do well rather than poorly, but
this difference was not significant (t = 1.15). More specifically, however,
white males in the evaluative atmosphere with a white tester scored 121.46
in mean IQ following a high expectation but only 110.31 if the tester's
expectation was low, a significant instructional effect (p<.05). In
the gamelike atmosphere, students did best when the tester was critical
rather than encouraging, the predicted reactance effect (t = 2.47, p<.002).

An Atmosphere x Expectation x Sex of Subject x Race of Experimenter 
interaction was also observed (F(1,384) = 2.80, p<.10), but it was of
only marginal reliability. While interpretation of a four-factor interaction
is rather difficult, the pattern of mean IQs shown in Figure 1 is suggestive
of the following: The Atmosphere x Expectation interaction was strongest
in the presence of a black tester and was, overall, stronger for males 
than for females. 
Correlates of IQ 

Scores on the socio-economic index were positively correlated with 
IQ for both male (r = +.39; t(414) = 6.11, p<.002) and female (r = +.20;
t(414) = 2.98, p<.02) subjects. In other words, students from more
advantaged home environments tended to score higher in IQ.

Emotional and personality correlates of IQ are shown in Table 3.
On the mood checklist, relaxed and happy mood states, like elation and 

Insert Table 3 about here 

surgency, were negatively related to IQ, as were tense emotional states 
like aggression or unhappy states like fatigue and sadness. Concentration
was positively related to IQ, but only the data for whites were statistically 
significant. In general, though, these relationships were more often 
significant for blacks than for whites and for males than for females. 
Subject self-ratings of performance were positively correlated
with IQ, but only significantly so for whites. On the personality 
measures, general anxiety was negatively related to IQ for blacks but 
not significantly so for whites. Test anxiety and an "external" or 
fatalistic view of life on the I-E scale were negatively correlated 
with IQ for both blacks and whites. 
Reactance Effects in the Evaluative Atmosphere 

It was suggested in the introduction that examinees in the ego- 
threatening evaluative atmosphere who received a low expectation from an 
opposite-race tester might be motivated to apply themselves to task- 
completion so as to disprove the tester's negative assessment. If it 
occurred, such resistance would be manifested in peak performance following 
a low expectation in the evaluative atmosphere (a reactance effect) rather 
than the otherwise-predicted instructional effect. In Figure 1 it appears 
that the only reactance-type effects which were observed in the evaluative 
atmosphere occurred among black males and white females in the presence 
of a white tester. For white males and black females in the presence of
a white tester and for all subjects in the presence of a black tester, 
peak performance was observed in the evaluative atmosphere following a 
high rather than a low tester expectation. 

Thus, reactance-type effects occurred in the evaluative atmosphere 
only in the presence of a white tester. Moreover, as was explained earlier,
a significant main effect for Race of Tester (p <.01) as well as a 
marginally significant four-factor interaction involving the race of 
the tester (p<.10) were disclosed in analyses of the IQ data. These 
findings suggested that the data gathered by white and black experi- 
menters should be separated to permit a more detailed analysis. As 
can be seen in the left half of Figure 1, a black examiner induced the
same pattern of mean IQs whether he was working with black males, black
females, white males, or white females: an instructional effect in the
evaluative atmosphere and a reactance effect in the gamelike. The U-shaped
curves are a graphic representation of the Atmosphere x Expectation
interaction which was mentioned earlier; an analysis of variance on the
data gathered by a black tester revealed this interaction in significant
strength (F(1,192) = 5.03, p<.05).

IQs for white and black subjects faced with a white tester are shown 
in the right half of Figure 1. Perhaps the most striking feature of these 
data is the degree to which the curves for males and females intersect, 
indicating a rather opposite reaction on the part of the two sexes to 
the various test settings. Among blacks exposed to an evaluative atmosphere,
females conformed to the white tester's expectations while males
resisted this manipulation and did best when the tester forecast a poor
performance. Among whites in the evaluative atmosphere it was males who
conformed to the tester's expectations and females who resisted. The
sharply contrasting reactions of male and female subjects to the expectation
treatment thus had an additional racial component in that black
males resisted while white males conformed, and white females resisted
while black females conformed. Consequently, an analysis of variance
revealed a significant Expectation x Sex of Subject x Race of Subject
interaction (F(1,192) = 4.61, p<.05). In addition, a marginally reliable
Atmosphere x Expectation x Sex of Subject interaction confirmed that these
contrasting responses on the part of male and female subjects were most
pronounced in the evaluative atmosphere (F(1,192) = 3.40, p<.07).
The only reactance effects observed in the evaluative atmosphere,
then, were for black males in the presence of an opposite-race tester
and white females in the presence of an opposite-sex but same-race tester.
How might these phenomena be interpreted?

The overall positive correlation between black is beautiful and IQ 
which was found for black (r = +.22) but not for white (r = -.09) males
in Table 3 may provide a clue as to the psychological processes underlying
the reactance effect shown by black males in the presence of a white tester
in the evaluative atmosphere. One finds that in this setting the correlation
between black is beautiful and IQ became still more positive for black
males (r = +.32). In the gamelike atmosphere with a white tester, by
contrast, the correlation between black is beautiful and IQ was negative
for black males (r = -.30). The difference between these correlations was
significant (z = 2.13, p<.02). Belief in black is beautiful was particularly
positively correlated with IQ for black males who received a low expectation
from a white tester in the evaluative atmosphere (r = +.49; t(11) = 1.90,
p<.10).
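
The t value attached to a single correlation can be recovered from r and the cell size. The sketch below uses the standard test of a Pearson correlation against zero; the function name is hypothetical, and the sample size of 13 is an assumption inferred from the 11 degrees of freedom and the 13 subjects per cell.

```python
import math

def t_from_r(r, n):
    """Return (t, df) for testing a Pearson correlation against zero.

    Uses t = r * sqrt(df) / sqrt(1 - r^2) with df = n - 2.
    """
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1.0 - r * r)
    return t, df

# Assumed n = 13: one cell of the design (black males, white tester,
# evaluative atmosphere, low expectation).
t, df = t_from_r(0.49, 13)
print(round(t, 2), df)
```

With these inputs the statistic comes out near 1.86 on 11 degrees of freedom, close to the reported 1.90; a gap of that size is what one would expect if r was rounded to two decimals before printing.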

Although no overall relationship between women's liberation and IQ 
was found for female subjects of either race, the results for white females
did parallel those for black males in certain respects. In the evaluative
atmosphere with a white tester, women's liberation was positively correlated
with IQ (r = +.21). In the gamelike atmosphere the relationship was
negative (r = -.49; t(24) = 2.76, p<.02). The difference between these
correlations was statistically significant (z = 2.55, p<.01). In the
low expectation condition in the evaluative atmosphere with a white male
tester the IQ of white females was positively related to belief in women's
liberation but not significantly so (r = +.22).
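
The text does not say how the difference between two correlations was tested. One common choice for independent samples is Fisher's r-to-z transformation, sketched below as an assumption rather than as the authors' procedure. The function name is hypothetical, and n = 26 per group is inferred from the 24 degrees of freedom reported for the gamelike correlation and assumed for the evaluative one.

```python
import math

def fisher_z_difference(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations.

    Each r is transformed with atanh (Fisher's r-to-z); the standard error
    of the difference is sqrt(1/(n1 - 3) + 1/(n2 - 3)).
    """
    z1 = math.atanh(r1)
    z2 = math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z1 - z2) / se

# White females with a white tester: evaluative r = +.21, gamelike r = -.49.
# n = 26 per atmosphere is an assumption (two expectation cells of 13).
z = fisher_z_difference(0.21, 26, -0.49, 26)
print(round(z, 2))
```

Under these assumptions the statistic is about 2.54, in line with the reported 2.55. The same function applied to the black male correlations (+.32 and -.30, again assuming 26 per group) gives a value a little above the reported 2.13.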

Perhaps being in an evaluative atmosphere with a white male tester
somehow stimulated the group pride of black males and white females, leading 
to an arousal of reactance motivation when they were challenged by a low 
expectation. If so, the results indicate that for these groups the reactance 
aroused by the tester's challenge was powerful enough to override the 
otherwise general tendency to perform better after receiving a high expec- 
tation in the evaluative atmosphere. There were no comparable findings 
in either the IQ or the personality data for white males or black females. 

DISCUSSION 

It appears that in the non-evaluative gamelike atmosphere test 
performance was facilitated rather than debilitated by the moderate 
anxiety or reactance motivation induced by an examiner's low expectation. 
In the evaluative atmosphere, by contrast, anxiety was by the nature of 
the experimental manipulations induced to be moderately high from the 
start; here, the added stress of an expressed low expectation on the 
part of the tester should have been debilitating. With the exception
of the reactance effects found for black male and white female subjects 
in the evaluative atmosphere with a white tester, these predictions were 
substantially confirmed, as can be seen in Table 2. 

The correlational data in Table 3 support the hypothesis that a moderate
level of internal arousal induces optimal performance on an intellectual
task. With less than moderate arousal, the performer will not be motivated 
to take the task seriously and so will focus insufficient attention on 
its completion. Thus, relaxed mood states like elation and surgency as 
well as depressive mood states like fatigue and sadness or a fatalistic, 
external world view were negatively related to IQ in Table 3. With more
than moderate arousal, however, the performer will be distracted by his
internal state and may fail utterly. Thus, aggression in addition to
general and test anxiety were negatively correlated with IQ in Table 3.

Even though observed IQ seems to have been reliably altered in response
to the experimental manipulations, there was one major respect in which 
the data were disappointing: The flexibility of interracial differences 
in IQ was not convincingly demonstrated; there was virtually no overlap 
in mean IQ between the various groups of black and white subjects. 
A replication of the experiment did, however, demonstrate the anticipated 
manipulability of interracial IQ differences when socio-economic status
was treated as an independent variable. 

EXPERIMENT II 

Method

The research was conducted during 1973-74 at two Sacramento junior 
high schools different from those used for Experiment I. The WISC performance 
measures were administered to 104 white and 104 black male students between 
12 and 16 years of age. The variables of test atmosphere, tester expectation, 
race of tester, and race of subject were placed in a 2x2x2x2 factorial 
design. In all other respects, the procedure was identical to that 
utilized in Experiment I. 

RESULTS 

Since Experiment II duplicated procedures employed with male subjects 
in Experiment I, the two sets of data are discussed together in the 
analyses which follow. Hereafter, the results for males in Experiment I 
will be referred to as the 1972-73 experiment and the results for males 
in Experiment II as the 1973-74 experiment. 
Success of the Manipulations 

Across the 1972-73 and 1973-74 experiments, self-ratings of performance
were considerably more positive for subjects in the high than in the low
expectation conditions (F(1,412) = 273.74, p < .001). Elation, too, was
greater in the high expectation conditions than in the low (F(1,412) =
14.36, p < .01), while anxiety was greater in the evaluative than in the
gamelike atmosphere (F(1,412) = 5.74, p < .05).
IQ Data

Shown in Figure 2 are the mean IQ scores from the 1972-73 and 1973-74
experiments (each point representing 13 subjects). The ways in which the
second experiment replicated the first will be considered before the
relatively minor differences between these sets of data are discussed.

Insert Figure 2 about here

In both studies, whites scored higher in IQ than blacks (F(1,384)
= 79.59, p < .001). The overall mean IQ for whites was 112.25 while that
for blacks was 99.91, an interracial difference of 12.34 points. Students
of both races performed better in the presence of a white rather than
a black tester (F(1,384) = 23.17, p < .01).

Also in both experiments, a significant Atmosphere x Expectation
interaction (F(1,384) = 8.74, p < .01) developed from the mean IQs shown
in Table 4. These means combined the IQs of black or white male subjects
tested by white or black experimenters. In an evaluative atmosphere,
students scored higher in IQ if told they would do well than if told
they would do poorly (t = 2.24, p < .05), an instructional effect. In
the gamelike atmosphere, students did best when the tester was critical
rather than encouraging (t = 1.97, p < .05), a reactance effect. The
generally U-shaped curves in Figure 2 are the graphic representation of
the Atmosphere x Expectation interaction shown in Table 1.

Insert Table 4 about here

The results for the 1973-74 experiment differed from those gathered 
for male subjects in 1972-73 in just two significant respects. First,
subjects in the 1973-74 study had marginally higher IQs than those in
the 1972-73 experiment (F(1,384) = 3.30, p < .10). Second, black subjects
in the 1972-73 experiment who received a low expectation in the evaluative
atmosphere from a white tester scored 5.00 points above black subjects given
a high expectation in this setting. In 1973-74, however, black subjects
who received a high expectation in the evaluative atmosphere from a white
tester scored 7.92 points above those given a low expectation. Underlying
both of the foregoing differences between the two experiments may be the
fact that students in the 1973-74 study were of higher SES (12 points on
the 100-unit scale) than those in the 1972-73 research (t(414) = 7.85,
p < .001). The four schools from which students were sampled each had
approximately the same proportion of black students (about 25%), but
the two schools in which the 1973-74 experiment was conducted were located
in more prosperous neighborhoods.

Consequently, the data for male subjects in the 1972-73 and 1973-74
studies were combined and the population divided into groups above and 
below the median in SES. The results are shown in Table 5 and Figure 3. 

Insert Table 5 and Figure 3 about here 

Clearly, subjects above the median in SES scored substantially higher in 
IQ than those below the median (F(1,384) = 24.41, p < .01).
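
As an illustration only, the median split on SES described above might be sketched as follows. The function name, the example scores, and the rule for scores falling exactly on the median are assumptions introduced here; the paper does not report how ties were assigned, and none of the numbers below are study data.

```python
from statistics import median


def median_split(ses_scores):
    """Label each score 'high' if it lies above the sample median, else 'low'.

    Assumption: scores equal to the median are labeled 'low'; the paper
    does not say how ties at the median were handled.
    """
    cutoff = median(ses_scores)
    return ["high" if score > cutoff else "low" for score in ses_scores]


# Hypothetical SES scores on a 100-unit scale (illustrative, not study data).
example_scores = [22, 35, 41, 48, 55, 63, 70, 88]
example_labels = median_split(example_scores)
print(example_labels)
```

With an even number of distinct scores, a split of this kind places half of the sample in each SES group; the resulting labels can then serve as a two-level factor alongside atmosphere, expectation, race of tester, and race of subject.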

All but one of the functions in Figure 2 are U-shaped, indicating
that both high and low SES subjects displayed the Atmosphere x Expectation
interaction mentioned earlier, with low SES blacks providing the sole
exception. The latter, after receiving a low expectation from a white
tester in an evaluative atmosphere, scored 2.41 points above their high
expectation counterparts. Though this is not a significant difference,
it is the same phenomenon which was observed in the 1972-73 experiment
and which failed to replicate in 1973-74, apparently because the latter
population contained a greater proportion of high SES members.

Table 5 and Figure 3 indicate that high SES black students responded 
to the experimental manipulations in much the same way as did whites of 
either high or low SES. High SES black students did rather well on the 
WISC. When they were given encouragement by a white male tester in the
evaluative atmosphere, their mean IQ reached 114.60, a value exceeded by
whites in only three out of sixteen cells. Two interactions are relevant
to this finding: Atmosphere x SES x Race of Experimenter (F(1,384) = 4.74,
p < .05), which seems to have developed from the fact that within each
racial group the best performance was recorded for high SES students in
the presence of a white tester in the evaluative atmosphere, and Atmosphere
x SES x Race of Subject x Race of Experimenter (F(1,384) = 8.76, p < .01),
which is somewhat attributable to the observation that the IQ of high
SES blacks equaled that of low SES whites in the evaluative atmosphere
with a white tester and in the gamelike atmosphere with a black tester.
Mood, Motivation, and IQ

The personality and mood correlates of test performance for males in 
the 1972-73 and 1973-74 experiments will not be described at length, since 
they paralleled the findings shown in Table 3 for Experiment I. More
directly relevant to the hypothesis that internal arousal must be at
a moderate level for optimal performance are the mean scores for aggression
and anxiety, shown in Table 6.

Insert Table 6 about here

Anxiety and aggressive motivation were minimal in the relaxed gamelike
atmosphere when the tester praised the subject's abilities, but both arousal
states showed an increase when the tester expressed a low expectation in
this setting. As Table 4 indicates, IQ increased along with the increasing
motivation. In the more stressful evaluative atmosphere, however, anxiety
and aggression seem to have become excessive when the tester induced the
subject's state of internal arousal to go beyond the optimal level through
criticism of the latter's ability. Here, it was the encouragement offered
by a high tester expectation which maintained arousal at a moderate level
and permitted peak performance on the WISC.
Reactance Effects Among Low SES Students

It was noted earlier that low SES black students whose ability was 
criticized by a white tester in the evaluative atmosphere seemed to resist 
the tester's low expectation by outscoring their counterparts in the high 
expectation condition. In Experiment I, this phenomenon was observed among
white females as well as black males, and it was suggested that the 
group pride of these subjects was challenged by a white male tester to 
a degree not felt by white male or black female subjects. 

The present data indicate, however, that low SES white males may also 
to some extent be challenged by a white tester's low expectation. In 
a separate analysis of the IQ data gathered by a white tester (that is,
the right half of Figure 3) a marginal Expectation x SES interaction
emerged (F(1,192) = 2.73, p = .10). In general, high SES subjects performed
better on the WISC after being encouraged by a high expectation while low
SES subjects tended to do better following a low expectation. In addition,
an Atmosphere x SES interaction (F(1,192) = 4.78, p < .05) revealed that
high SES subjects excelled in the evaluative atmosphere while low SES
subjects performed best in the gamelike (especially, it appears in Figure 3,
if the tester expressed a low expectation). Even if attention is restricted
to the evaluative atmosphere, however, the Expectation x SES interaction
persists (F(1,96) = 3.01, p < .10). Finally, of course, the Atmosphere x
Expectation interaction was also found to be significant (F(1,192) = 5.11,
p < .05). None of the foregoing effects interacted with the race of the
subject (all Fs < 1).

When the IQ data gathered by a black tester (the left half of Figure 3) 
were separately analyzed, the Expectation x SES interaction did not appear 
(F < 1). An Atmosphere x SES x Race of Subject interaction (F(1,192) = 4.80, 
p < .05) reflected for the most part the equalization of high SES black
and low SES white IQs in the gamelike atmosphere, and the Atmosphere x
Expectation interaction was also significant (F(1,192) = 4.17, p < .05).
Despite these reliable effects, however, it seems that a black tester did
not motivate or challenge subjects to the same degree as his white counterpart;
subjects of both races scored lower in IQ in the presence of a black tester.
This could mean that a black tester was not taken as seriously as a white
one (that is, subjects did not try as hard to impress him), so the changes
in IQ induced by his communication of the atmosphere and expectation
manipulations would have worked off a lower baseline of testee motivation.
Black students seemed to be inspired to achieve a relatively high IQ in
the presence of a black tester only when those of high SES were startled
by a low expectation in what they had been led to believe was a "do your
thing" gamelike atmosphere. Naturally, it must be kept in mind that just
one black and one white experimenter gathered the IQ data. Any effects
attributed to race of tester are potentially confounded by the personalities
of the individual experimenters and their proficiency in administering the
WISC. Only further research and replication can clarify the mechanisms
underlying race of tester effects. 

DISCUSSION 

Among both black and white subjects, instructional effects (peak 
performance in response to praise) predominated in the evaluative atmosphere 
while reactance effects (peak performance in response to criticism) 
predominated in the gamelike. This Atmosphere x Expectation interaction 
is interpreted as signifying that, in the ego-threatening evaluative 
atmosphere, internal arousal (one component of which is anxiety) was optimal 
when the subject was reassured by a high tester expectation but became 
excessive and, hence, performance-debilitating when the tester was critical. 
In the relaxed gamelike atmosphere, by contrast, a tester's low expectation
served to elevate arousal from a low, insufficiently motivating level to
a moderate, optimally motivating one and so facilitated test performance.
In Tables 2 and 4 it can be seen that the range of variation in mean IQ
which appears to be attributable to this Atmosphere x Expectation interaction
is around 4-5 points.

For any testee, then, the most facilitative environment seems to be 
one which develops and maintains internal arousal at an optimal, moderate
level, avoiding the extremes of anxiety or disinterest. Intriguingly,
Doob and Kirshenbaum (1973), in a study of the effects of frustration and
aggressive films on emotional arousal, similarly discovered that performance
on a digit symbol task was a U-shaped function of arousal: moderately
elevated levels of blood pressure produced peak performance on the digit
symbols while normal resting levels or excessively high levels served
to debilitate performance.

It is possible, of course, that some construct other than testee 
motivation might be able to account for the data. Since the experimenters 
overtly communicated their expectations to the subjects, demand effects 
were no doubt operative (Orne, 1962). While such demands could explain
the instructional effects in the evaluative atmosphere, however, they
cannot easily account for the reactance effects in the gamelike setting.
Furthermore, Experimenter2, who administered the subscales from which a
given subject's IQ was calculated, was blind as to the subject's prior
treatment by Experimenter1. All experimenters were kept ignorant of
the hypotheses until the conclusion of the research, but regardless of 
that precaution Experimenter2 would have been unable to place differential 
demands on the subjects' behavior so as to confirm any predictions. 

A more sophisticated alternative explanation for the results might 
involve Rosenberg's (1965) concept of evaluation apprehension. Perhaps 
certain groups of subjects (like white females or black males or students
of low SES) were more likely to discuss the experiment among themselves
because they were more fearful of being tested. Armed with prior knowledge
of the research procedures, they may have resisted the experimenter's
expectation manipulation as a way of telling him that they were aware of
his efforts at deceiving them. This alternative does not, however,
explain why gossip would be most likely to induce such resistance (a
reactance effect) if a given subject's tester happened to be white rather
than black, nor does it explain why the data for all groups of subjects,
not just the most apprehensive, showed reactance effects in the gamelike
atmosphere. If enough untested assumptions are included, evaluation
apprehension could become a viable alternative explanation of the
data; at present, though, a motivational interpretation seems more
parsimonious. 

To the extent that reactance effects were observed in Experiments 
I and II, the findings appear to contradict those gathered in the "self- 
fulfilling prophecy" or "Pygmalion" paradigm initiated by Rosenthal and 
Jacobson (1968). They found that when a teacher had been induced to have
a high expectation of the abilities of certain randomly-selected students,
the classroom performance of these students improved; by implication,
a teacher's low expectation should debilitate performance. How can the
results of the present research, in which an overtly-expressed low
expectation seemed sometimes to motivate or challenge students to do their
best on the WISC, be reconciled with those in the self-fulfilling prophecy
tradition? The answer may lie in the word, overt. Chaikin, Sigler, and
Derlega (1974) led undergraduate tutors to believe that a 10-year-old
interviewee was either "quite bright" (IQ = 130) or "somewhat slow"
(IQ = 85). It was found that tutors expecting a bright pupil leaned
toward the interviewee, looked him in the eye, nodded their heads up and
down, and smiled more frequently than tutors expecting a dull pupil;
the former were also less likely to exhibit behaviors indicating dislike
or disapproval, such as leaning backwards. Word, Zanna, and Cooper (1974)
found that subjects exposed to an interviewer trained to emit standardized
nonverbal cues of disapproval made a poorer impression on naive raters
than those exposed to a nonverbally approving interviewer. So the subtle
communication of a low expectation may indeed produce the well-known
Pygmalion effect. However, an evaluator's low expectation may induce a
poor performance on the part of examinees in such situations because it
is so subtly expressed that any challenge to it is short-circuited by
the ambiguity in the situation. Research in which such things as atmosphere
and expectation manipulations were either subtly or obviously communicated 
by the tester to the testee should serve to clarify the conditions under 
which one might anticipate a self-fulfilling prophecy rather than a 
reactance or challenge phenomenon. 

The present research suggests that the variables of atmosphere and 
expectation may, when overtly expressed, interact with the subject's 
race and social class so as to have a considerable impact on his or her 
IQ score. If reliable and replicable, such findings would call into 
question Jensen's (1969) assertion that, since differences in the social 
and psychological environments to which white and black Americans are 
routinely exposed appear insufficient to account for interracial differences 
in mean IQ, a genetic explanation of these differences is called for. 

Important questions remain, to be sure. Why, for instance, do 
interracial differences persist across parallel conditions? Even though 
high SES blacks performed remarkably well on the WISC when tested by a
white experimenter in the evaluative atmosphere, why were they still
outperformed by high SES whites in this same setting? Many answers are
possible. The students were in the experimental situation for less than
an hour; the cumulative effects of differential past experience for black
and white subjects may not be so easily overcome. Furthermore, even though
a white tester may, in general, have been more motivating than his black
counterpart, he was probably not an unequivocally positive stimulus for 
a black student. 

Since Sacramento is a medium-sized, highly mobile city in which the
schools participating in the research were at most a few miles and in one
instance a few blocks apart, it seems rather doubtful that the "high"
and "low" categories created by the median split on the SES dimension
reflect substantially different gene pools. If so, if experience rather
than heredity can be regarded as the major difference between the high
and low SES groups, then the results would seem to imply that interracial
differences in mean IQ can be erased or possibly even reversed depending 
on certain social-psychological characteristics of the test setting and 
the socio-economic background of the testee. 
Table 1

Mean IQ for Male and Female Subjects^a

                                  Black Subjects        White Subjects
                      Expec-      Black     White       Black     White
Sex     Atmosphere    tation      Tester    Tester      Tester    Tester

Male    Evaluative    High         98.54     97.00      111.08    121.46
                      Low          91.69    102.00      109.38    110.31
        Gamelike      High         93.23     96.46      107.15    109.46
                      Low          96.23    105.62      109.15    118.46

Female  Evaluative    High         96.69     97.38      106.76    111.00
                      Low          91.69     94.69      104.76    117.53
        Gamelike      High         89.30    101.06      103.15    112.30
                      Low          97.69     97.46      110.15    115.92

^a There were 13 subjects per cell. The following critical values for
assessing the significance of differences between means have been derived
from procedures for individual comparisons in Hays (1963):
17.09 (p < .002), 12.82 (p < .02), 10.83 (p < .05), 9.12 (p < .10).
Table 2

The Atmosphere x Expectation Interaction
for Male and Female Subjects^a

                      Atmosphere
Expectation     Evaluative    Gamelike

High              104.99       101.51
Low               102.75       106.33

^a There were 104 subjects per cell.
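
As a rough check on how Table 2 follows from Table 1, the sketch below averages the eight Table 1 cell means (13 subjects each, hence 104 per Table 2 cell) that share an atmosphere and expectation level, and then forms a simple interaction contrast. The dictionary layout, function names, and the contrast itself are illustrative assumptions rather than the authors' own computation; the cell means are transcribed from Table 1.

```python
# Cell means transcribed from Table 1 (13 subjects per cell).
# Key: (sex, atmosphere, expectation). Values, in order:
# black subject/black tester, black subject/white tester,
# white subject/black tester, white subject/white tester.
TABLE_1 = {
    ("male", "evaluative", "high"): [98.54, 97.00, 111.08, 121.46],
    ("male", "evaluative", "low"): [91.69, 102.00, 109.38, 110.31],
    ("male", "gamelike", "high"): [93.23, 96.46, 107.15, 109.46],
    ("male", "gamelike", "low"): [96.23, 105.62, 109.15, 118.46],
    ("female", "evaluative", "high"): [96.69, 97.38, 106.76, 111.00],
    ("female", "evaluative", "low"): [91.69, 94.69, 104.76, 117.53],
    ("female", "gamelike", "high"): [89.30, 101.06, 103.15, 112.30],
    ("female", "gamelike", "low"): [97.69, 97.46, 110.15, 115.92],
}


def marginal_mean(atmosphere, expectation):
    """Average the eight equal-n cell means for one atmosphere/expectation pair."""
    values = [
        mean
        for (_sex, atm, exp), cells in TABLE_1.items()
        if atm == atmosphere and exp == expectation
        for mean in cells
    ]
    return sum(values) / len(values)


def interaction_contrast():
    """(High - Low) in the evaluative atmosphere minus (High - Low) in the gamelike."""
    evaluative = marginal_mean("evaluative", "high") - marginal_mean("evaluative", "low")
    gamelike = marginal_mean("gamelike", "high") - marginal_mean("gamelike", "low")
    return evaluative - gamelike


for atm in ("evaluative", "gamelike"):
    for exp in ("high", "low"):
        print(atm, exp, round(marginal_mean(atm, exp), 2))
print("contrast", round(interaction_contrast(), 2))
```

Because every cell holds the same number of subjects, the unweighted average of cell means equals the pooled mean, so the four values agree with Table 2 to within rounding. A positive contrast reflects the crossover pattern: a high expectation helps in the evaluative atmosphere, whereas a low expectation helps in the gamelike one.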
Table 3

Personality Correlates of I.Q.
by Race and Sex of Subject

                                 Black Subjects                        White Subjects
                        Combined     Male         Female       Combined     Male         Female

Aggression              [partly illegible; legible entries, in order: -.19*, -.35****, -.18*]
Concentration           +.11         +.12         +.06         +.20***      +.31***      +.11
Egotism                 [illegible]  [illegible]  [illegible]  -.12*        -.18*        -.08
Elation                 [illegible]  -.25***      -.02         -.10         -.04         -.18*
Fatigue                 -.17***      -.24***      -.10         -.09         -.10         -.10
Sadness                 -.26****     -.37****     -.20**       -.22****     -.13         -.30***
Skepticism              -.13*        -.07         -.20**       -.03         -.07          .00
Surgency                [illegible]  -.18*        -.09         -.02         -.11         +.05
How Well (Self-rate)    +.08         +.06         +.11         +.18***      +.20*        +.16
Black is Beautiful^a    +.16**       +.22**       +.07         -.05         -.09         +.03
General Anxiety^a       -.26****     -.30****     -.19*        -.06         -.16         +.01
Test Anxiety            -.18***      -.17*        -.18*        -.12*        -.12         -.10
[label illegible]       -.23****     -.25***      -.19*        -.33****     -.26***      -.39****

^a These correlations were derived from items on the follow-up questionnaire,
which a small number of subjects failed to complete.

****p < .002
***p < .02
**p < .05
*p < .10






Table 5

Mean IQ for Male Subjects
High or Low in SES^a

                                    Black Subjects          White Subjects
                       Expec-       Black      White        Black      White
SES    Atmosphere      tation       Tester     Tester       Tester     Tester

High   Evaluative      High          99.62     114.60       113.72     126.76
                       n =               6         10           18         17
                       Low     [illegible]     107.67       111.25     115.36
                       n =               9          9           16         14
       Gamelike        High         101.78     103.08       108.25     114.35
                       n =               9         13           16         17
                       Low     [illegible]     104.67       112.13     118.12
                       n =              11          9           15         17

Low    Evaluative      High          97.22      98.12       108.75     110.22
                       n =              18         16            8          9
                       Low           95.47     100.53        99.60     108.50
                       n =              17         17           10         12
       Gamelike        High          91.59      97.08       106.20     110.44
                       n =              17         13           10          9
                       Low           93.00     104.82       107.73     114.22
                       n =              15         17           11          9

^a The following critical values for assessing the significance of
differences between means have been derived from procedures for individual
comparisons in Hays (1963): 17.63 (p < .002), 12.84 (p < .02), 10.80
(p < .05), 9.04 (p < .10).
Table 6

Mean Anxiety and Aggression
for Male Subjects

                          Anxiety^a                            Aggression
                          Atmosphere                           Atmosphere
Expectation     Evaluative  Gamelike  p(diff)        Evaluative  Gamelike  p(diff)

High               3.24       2.45     .02              2.36       2.39     n.s.
Low                3.32       3.05     n.s.             3.19       2.89     n.s.
p(diff)            n.s.       .10                       .05        n.s.

^a F for Atmosphere = 5.74, 1/412 df, p < .05
   F for Expectation = 6.[illegible]5, 1/412 df, p < .05
Motivation, Race, Social Class, and IQ

If an IQ test is administered to children attending a racially inte- 
grated school, blacks will generally average from 11 to 15 points below 
whites (though there will be a substantial overlap in the two distributions).
Jensen's (1969) suggested explanation for this phenomenon was that, since
IQ has a very high heritability, large mean differences between racial
groups must be predominantly genetic in origin. Environmentalists (e.g.,
Garcia, 1972) have replied that any such interracial differences are entirely
attributable to an alleged bias in the content of IQ tests in favor of
cultural experiences more readily accessible to middle and upper income
whites.

While debates over the heritability of IQ and the potential for culture 
bias in measuring instruments have generated much research and public 
comment, it is also possible to investigate the significance of interracial 
differences in mean IQ by ignoring both the foregoing issues and instead 
examining the social psychology of the test situation itself. Sattler's 
(1970) review of the extensive literature on this topic found support 
for several hypotheses (among others): black children may have generally 
lower achievement motivation than whites; expectations of failure or fear
of appearing "uppity" may impair the performance of black children when
they anticipate comparison with whites; the performance of black children
may be improved by providing a same-race tester; the performance of all
children may be improved when the tester has a favorable rather than an
unfavorable expectation of their ability [when overtly revealed, such
expectations may result in what Rosenthal (1965) has called an "experimenter"
or instructional effect]. Since black students have in the past usually
been tested by white examiners in what are often competitive, ego-threatening
situations, it is conceivable, if black examinees tend to believe that
practically any white tester will be prejudiced against them, that
performance-debilitating self-fulfilling prophecies could be set in motion
(Rosenthal and Jacobson, 1968).

However, the influence of the test setting on the observed ability 
of black and white testees is not as consistently predictable as the preceding
paragraph implies. Katz (1970), a prominent researcher in this area, has 
found, rather paradoxically, that the performance of black students often 
improves when they are tested by a white rather than by a black examiner, 
when they are in the presence of white rather than black agemates, or when 
they are to be compared against white rather than black norms. According 
to Katz, this may occur because whites are regarded as evaluators that one 
should try harder to impress or because comparison with white standards 
is more informative for self-evaluations of ability. 

A study with hybrid results was recently reported by McClelland (1974). 
He found that both black and white subjects were more cooperative and 
motivated to achieve when a white rather than a black interviewer asked 
them to complete a battery of items from intelligence and personality
tests. But both groups scored higher in intelligence in the presence of
a black interviewer, perhaps, as McClelland suggests, because the white
tester stimulated evaluation apprehension along with achievement motivation, 
thus producing "lower intelligence test scores, due to higher anxiety." 

It may in fact be the case that the single construct of testee motivation 
(one element of which is test anxiety) could account for this complex array 
of evidence. Wine (1971) suggested that anxiety elicits "two classes of 
responses: those related to task completion, which are anxiety reducing,
and those which interfere with task completion." When anxiety is extremely
high, the performer's state of internal arousal becomes disruptively
distracting, which causes the interfering responses to predominate and
which in turn debilitates performance. Sarason (1961) found that under 
conditions of ego threat habitually low test anxious subjects will surpass 
those whose performance has been debilitated by high test anxiety. Sarason 
also found, however, that under relaxed, non-threatening conditions, subjects 
low in test anxiety solved fewer anagrams than their habitually high test 
anxious counterparts; presumably, the latter were experiencing only a moderate, 
optimally motivating level of arousal due to the relaxed conditions whereas 
the former were not sufficiently motivated to complete the task. Weiner 
and Samuel (in press) administered Sarason's anagrams in an ego-threatening
environment to chronically high test anxious subjects, some of whom were
led to mislabel their test-induced physiological symptoms of anxiety as
being due to the side effects of a capsule (placebo) which they had swallowed
earlier. This group rated itself less anxious and was able to solve more 
anagrams than controls not given the opportunity to mislabel the source 
of their symptoms of internal arousal. 

Such processes could account for the seemingly contradictory effects 
for race of tester on testee performance which have emerged in the literature. 
A white tester may be perceived as a more powerful evaluator than a black
counterpart and so will elicit a better performance from testees so long
as the greater internal arousal associated with his presence goes no higher
than the optimal, moderate level. Certain environments may, however, induce
some anxiety or arousal independently of the characteristics of the examiner.
Under these conditions, the presence of a white evaluator could be excessively
arousing and so might result in performance impairment; instead, a less
disruptively stimulating black evaluator might be able to elicit a superior
performance.

In the present experiment, race of subject and race of tester were 
systematically varied. In addition, there were two test atmospheres such 
that students in one condition were explicitly told they were completing a 
battery of competitive tests while those in another condition thought they 
were working with a set of creative games and playthings. The tester also 
expressed either a high or a low expectation for the subject's probable 
performance. In all cells of this 2x2x2x2 factorial pairing of race of 
tester, race of subject, test atmosphere, and tester expectation, subjects 
completed the performance subscales to the Wechsler Intelligence Scale 
for Children. 

It was anticipated that whites would, overall, score 11-15 points 
higher in IQ than blacks, since this appears to be a stable finding in the 
literature. For subjects of both races, however, it was predicted that in 
the evaluative atmosphere the ego-threatening and competitive nature of 
the instructions would induce a state af fairly high anxiety in testees 
and that if the tester then expressed a high expectation this might reduce 
anxiety slightly to a more moderate level and so facilitate performance; if 
the tester instead expressed a low expectation in the evaluative atmosphere, 
the added stress of his criticism should definitely debilitate performance. 
In the gamelike atmosphere, by contrast, the setting was anticipated to be 
so relaxed that testee motivation would be insufficient for optimal performance. 
Here it was predicted that achievement would be facilitated rather than 
debilitated by the moderate anxiety induced by a tester's low expectation. 
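
These predictions amount to an inverted-U relation between arousal and performance. The toy sketch below shows how such a curve produces the predicted crossover. Every number in it is hypothetical: the arousal units, the baseline assigned to each atmosphere, the shift attached to each expectation, and the quadratic form are illustrative assumptions, not values taken from the study.

```python
# Toy inverted-U model of the predictions; every number is hypothetical.
OPTIMAL_AROUSAL = 5.0   # assumed arousal level giving peak performance
PEAK_SCORE = 110.0      # assumed score at optimal arousal
PENALTY = 1.0           # assumed points lost per squared unit of deviation

# Assumed baseline arousal induced by each test atmosphere.
BASELINE = {"evaluative": 6.0, "gamelike": 2.0}
# Assumed shift in arousal: praise calms slightly, criticism arouses.
SHIFT = {"high": -1.0, "low": 2.0}


def predicted_score(atmosphere: str, expectation: str) -> float:
    """Return the score implied by a quadratic inverted-U curve."""
    arousal = BASELINE[atmosphere] + SHIFT[expectation]
    return PEAK_SCORE - PENALTY * (arousal - OPTIMAL_AROUSAL) ** 2


for atmosphere in BASELINE:
    for expectation in SHIFT:
        score = predicted_score(atmosphere, expectation)
        print(f"{atmosphere:10s} {expectation:4s} expectation: {score:.1f}")
```

With these assumed values the high expectation yields the better score in the evaluative atmosphere and the low expectation yields the better score in the gamelike atmosphere, which is the pattern the hypotheses describe.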

It was further speculated that there might be situations in which 
reactance motivation could override the debilitative effects of high test 
anxiety. Brehm's (1966) theory of reactance states that when people feel 
their freedom of action is being threatened by manipulation or coercion 
they will resist the threat and seek to emphasize their freedom to behave 
oppositely. In the present research, it was felt that if testees in the 
ego-threatening evaluative atmosphere were challenged by a low expectation 
on the part of an opposite-race tester, an especially strong desire to 
disprove the tester's negative assessment might lead to an effort at 
suppressing task-interfering responses to permit a resolute concentration 
on task completion. Baron and Ganz (1972) have suggested that reactance 
motivation might be especially likely to be aroused in black students 
confronted by a white evaluator, and Allen, Dubanoski, and Stevenson 
(1966) have reported that among older children criticism from a white 
experimenter was actually more effective than praise in maintaining the 
performance level of black testees. 

To summarize, it was anticipated that in the evaluative atmosphere 
subjects would perform better on the WISC following an expressed high 
expectation on the part of the tester (an instructional effect). In the 
gamelike atmosphere, however, it was predicted that performance would be 
optimal following a tester's low expectation; in the sense that the subject 
was predicted to behave oppositely from the tester's overt "demand," this 
might be called a reactance effect, though other motivational states, such 
as anxiety or irritation resulting from the tester's criticism, were 
also expected to contribute to the phenomenon. For students in the 
evaluative atmosphere, it was speculated that an opposite-race tester's 
low expectation might be viewed as a challenge; if so, it might arouse an 
especially strong motivation to disconfirm the tester's negative assessment, leading 
to a reactance effect rather than the instructional effect which was 
otherwise predicted for the evaluative atmosphere. Overall, it was 
anticipated that if performance responded as predicted to the manipulation 
of the testee's motivational state, the size of interracial differences in 
mean IQ would be found to be somewhat more flexible than was suggested 
by Jensen's (1969) review of the literature. 

EXPERIMENT I 



Method 



During 1972-73, the WISC performance measures were administered to 
208 black and 208 white junior high and high school students between 12 and 
16 years of age, equally divided by sex. The 2x2x2x2x2 factorial design 
varied test atmosphere (evaluative or gamelike), tester expectation (high 
or low), race of tester (black or white) and race and sex of subject. 
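
The size of the design can be checked by enumerating its cells. The sketch below is only an illustration; it assumes that subjects were allocated equally across cells, which agrees with the cell size of 13 reported with the IQ data.

```python
from itertools import product

# Factor levels as described in the Method section.
FACTORS = {
    "atmosphere": ("evaluative", "gamelike"),
    "expectation": ("high", "low"),
    "race_of_tester": ("black", "white"),
    "race_of_subject": ("black", "white"),
    "sex_of_subject": ("male", "female"),
}

TOTAL_SUBJECTS = 208 + 208  # black plus white students

# One dictionary per cell of the 2x2x2x2x2 design.
cells = [dict(zip(FACTORS, levels)) for levels in product(*FACTORS.values())]
subjects_per_cell = TOTAL_SUBJECTS // len(cells)

print(len(cells))          # number of cells in the design
print(subjects_per_cell)   # subjects per cell under equal allocation
```

Thirty-two cells of 13 subjects account for all 416 students.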

Arriving at an office provided by the school, the subject encountered 
Experimenter 1 of a team of two male experimenters. One team consisted of 
two white and the other of two black experimenters. Experimenter 1 described 
himself as a representative of Psychology Incorporated, a company which 
manufactures either "tests of intelligence and mental capacity" (evaluative 
atmosphere) or "creative games and playthings" (gamelike atmosphere). In 
the evaluative condition the tester further declared that the subject's 
performance on the tests would be compared to that of other students at 
the school and against city and nationwide norms. In the gamelike condition, 
Experimenter 1 assured the subject that performance on the tasks was the 
subject's "own thing" and that he should relax and take it easy since 
"no one is going to be compared to anyone else here." To supplement these 
manipulations, the tester wore a tie and jacket in the evaluative conditions 
but removed the jacket and loosened the tie when the atmosphere was to be 
gamelike. All experimenters were in their middle or late twenties. 

Experimenter 1 revealed that Psychology Incorporated had reviewed the 
student's grades in "think" courses like math, English, or art as well as 
"action" courses like gym, shop, or home economics to arrive at a 
prediction for the subject's performance. Those in the high expectation 
conditions were told they could expect to have an easy time with the tasks 
while those in the low expectation treatments were told they would probably 
have a difficult time. 

The student was then given the Object Assembly subtest of the WISC. 
In the evaluative atmosphere, an imposing interval timer was used to score 
performance on the items; in the gamelike atmosphere, the examiner used a 
wall clock and explained that he regretted having to time the activities 
but must try to keep things on schedule since other students would be 
arriving later. When the subtest was completed, Experimenter 1 reinforced 
the expectation manipulation by announcing that the subject had either done 
rather well, above the average, or rather poorly, below the average. He 
then explained that his partner, Mr. ______, had a few other tests 
(or games) for the student to work with. As he departed, Experimenter 1 
removed the interval timer from the table in the evaluative atmosphere 
or put on his jacket and straightened his tie in the gamelike condition; 
he was then replaced by Experimenter 2, who was blind as to the subject's 
prior treatment. Experimenter 2 administered the Picture Arrangement, 
Picture Completion, Block Design, and Coding subtests. 

After completing the WISC the subject made a self-rating of performance 
on a 7-unit scale running from "Very poorly" to "Very well." Next, he or 
she filled out a mood adjective checklist (Nowlis, 1965). The adjectives 
on the checklist comprise scales for measuring aggression (e.g., "angry"), 
anxiety ("clutched up"), concentration ("engaged in thought"), egotism 
("boastful"), elation ("overjoyed"), fatigue ("tired"), sadness ("sorry"), 
skepticism ("suspicious"), and surgency ("playful"). For each adjective, 
the subject indicated the degree to which it described his or her feelings 
on a 4-unit scale running from "definitely not" to "definitely." Finally, 
subjects were asked their home address and the occupations of their parents. 
Parents' occupations were referred to the tables of ranked occupational 
categories in Duncan, Featherman, and Duncan (1972). The addresses were 
referred to census tracts as another measure of social class. The combined 
social class index ranged from 0 to 100. 

At a second session some weeks later, subjects completed a group- 
administered questionnaire. This included a fully validated version of 
Rotter's internal-external scale adapted for use with children by Nowicki 
and Strickland (1973). According to Rotter (1966), "internals" generally 
believe they have control over the events which occur in their lives 
while "externals" believe their fates are decided by powerful deliberate 
or circumstantial forces beyond their control. Higher scores on the I-E 
scale are associated with greater externality. Also completed was the 
Marlowe-Crowne Social Desirability Scale (Crowne and Marlowe, 1960), 
which measures the strength of the subject's need for approval from 
others. In addition, two subscales for general and test-specific anxiety 
from Janis and Field (1959) were included on the questionnaire. 

The last items on the questionnaire specifically tapped attitudes 
toward women and blacks. One item for each target group asked the subject 
to indicate the frequency with which he or she thought the group had 
encountered discrimination, on a scale running from "never" to "extremely 
often." A second item asked the degree to which the subject felt women 
and blacks should oppose discrimination when and if it occurred, from 
"Always should relax and go along" to "Always stand up aggressively." 
The third item asked for the degree to which the subject agreed with 
whatever he or she felt was meant by "black is beautiful" or "women's 
liberation," respectively. Scores on the second and third items were 
combined for each target group into indices labeled "black is beautiful," 
for attitudes toward black assertiveness, and "women's liberation," for 
attitudes toward female assertiveness. Each of the three items was 
rated on a six-unit scale. 

Since the subjects were students under 16 years old, especial care 
was taken to see that each participant left the first session in a pleasant 
frame of mind. Particularly in the low expectation treatments, the 
experimenters emphasized that they had really wanted to study the effect 
of a person's mood on his or her test performance and that in order to 
accomplish this it had been necessary for the tester to exaggerate some 
of the things he had said about their abilities. Subjects were reassured 
as to the quality and complete confidentiality of their own performance, paid 
$3.00, sworn to secrecy, and released. One index of subject satisfaction 
is the degree to which they maintained silence. A probe for prior knowledge 
was conducted both before and during each debriefing; it did not prove 
necessary to discard any subject for suspicion induced by a prior parti- 
cipant's breach of confidence. 

RESULTS 

Success of the Experimental Manipulations 

Self-ratings of performance were considerably more positive for 
subjects in the high than in the low expectation conditions (F(1,412) = 
235.28, p < .001). Responses to the mood checklist also tended to confirm 
the success of the atmosphere as well as the expectation manipulations. 
Elation was greater in the high expectation treatment than in the low 
(F(1,412) = 4.93, p < .05) and was also greater in the gamelike than in 
the evaluative atmosphere (F(1,412) = 19.95, p < .01). Anxiety was greater 
in the evaluative than in the gamelike atmosphere (F(1,412) = 7.88, p < .01). 

IQ Data 

The IQ scores reported below are derived from the four subtests 
administered by Experimenter 2, prorated according to procedures in the 
WISC scoring manual (Wechsler, 1949). The prorating caused calculated 
IQs to be slightly higher than they would have been if the discarded 
Object Assembly score had instead been included and no prorating applied. 
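
The direction of this prorating effect can be illustrated with a short sketch. It assumes a simple proportional rule (the sum of the available scaled scores is scaled up to the full number of subtests); the manual's own tables are not reproduced here, and the function name and scaled scores are hypothetical.

```python
def prorate_sum(scaled_scores, full_count=5):
    """Scale the sum of the available subtest scores up to `full_count` subtests.

    Assumed proportional rule, used here only for illustration.
    """
    return sum(scaled_scores) * full_count / len(scaled_scores)


# Hypothetical scaled scores for one subject (not data from the study).
object_assembly = 8
# Picture Arrangement, Picture Completion, Block Design, Coding.
other_four = [10, 11, 10, 11]

prorated = prorate_sum(other_four)               # Object Assembly discarded
unprorated = sum(other_four) + object_assembly   # all five subtests summed

print(prorated, unprorated)
```

Under this rule the prorated sum exceeds the five-subtest sum whenever the discarded Object Assembly score falls below the mean of the four retained subtests.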

Table 1 and Figure 1 show the mean IQ measured in each cell of the 
experimental design (n = 13 subjects per cell). Male and female subjects 

Insert Table 1 and Figure 1 about here 

did not differ appreciably in IQ (F(1,384) = 1.79, n.s.), but white 
students scored higher in overall IQ than black students (F(1,384) = 109.45, 
p < .001). The overall mean IQ for whites was 111.13 while that for blacks 
was 96.67; the overall difference in mean IQ between the races was thus 
14.46 points. Students of both races generally performed better in the 
presence of a white rather than a black tester (F(1,384) = 17.11, p < .01). 

A significant Atmosphere x Expectation interaction (F(1,384) = 6.50, 
p < .02) developed from the mean IQs shown in Table 2. These means 

Insert Table 2 about here 

combined the scores of male or female, black or white subjects tested by 
black or white experimenters. In an evaluative atmosphere, students scored 
higher in IQ if they were told they would do well rather than poorly, but 
this difference was not significant (t = 1.15). More specifically, however, 
white males in the evaluative atmosphere with a white tester scored 121.46 
in mean IQ following a high expectation but only 110.31 if the tester's 
expectation was low, a significant instructional effect (p < .05). In 
the gamelike atmosphere, students did best when the tester was critical 
rather than encouraging, the predicted reactance effect (t = 2.47, p < .002). 

An Atmosphere x Expectation x Sex of Subject x Race of Experimenter 
interaction was also observed (F(1,384) = 2.80, p < .10), but it was of 
only marginal reliability. While interpretation of a four-factor interaction 
is rather difficult, the pattern of mean IQs shown in Figure 1 is suggestive 
of the following: The Atmosphere x Expectation interaction was strongest 
in the presence of a black tester and was, overall, stronger for males 
than for females. 

Correlates of IQ 

Scores on the socio-economic index were positively correlated with 
IQ for both male (r = +.39; t(414) = 6.11, p < .002) and female (r = +.20; 
t(414) = 2.98, p < .02) subjects. In other words, students from more 
advantaged home environments tended to score higher in IQ. 

Emotional and personality correlates of IQ are shown in Table 3. 
On the mood checklist, relaxed and happy mood states, like elation and 

Insert Table 3 about here 

surgency, were negatively related to IQ, as were tense emotional states 
like aggression or unhappy states like fatigue and sadness. Concentration 
was positively related to IQ, but only the data for whites were statistically 
significant. In general, though, these relationships were more often 
significant for blacks than for whites and for males than for females. 

Subject self-ratings of performance were positively correlated 
with IQ, but only significantly so for whites. On the personality 
measures, general anxiety was negatively related to IQ for blacks but 
not significantly so for whites. Test anxiety and an "external" or 
fatalistic view of life on the I-E scale were negatively correlated 
with IQ for both blacks and whites. 

Reactance Effects in the Evaluative Atmosphere 

It was suggested in the introduction that examinees in the ego- 
threatening evaluative atmosphere who received a low expectation from an 
opposite-race tester might be motivated to apply themselves to task- 
completion so as to disprove the tester's negative assessment. If it 
occurred, such resistance would be manifested in peak performance following 
a low expectation in the evaluative atmosphere (a reactance effect) rather 
than the otherwise-predicted instructional effect. In Figure 1 it appears 
that the only reactance-type effects which were observed in the evaluative 
atmosphere occurred among black males and white females in the presence 
of a white tester. For white males and black females in the presence of 
a white tester and for all subjects in the presence of a black tester, 
peak performance was observed in the evaluative atmosphere following a 
high rather than a low tester expectation. 

Thus, reactance-type effects occurred in the evaluative atmosphere 
only in the presence of a white tester. Moreover, as was explained earlier, 
a significant main effect for Race of Tester (p < .01) as well as a 
marginally significant four-factor interaction involving the race of 
the tester (p < .10) were disclosed in analyses of the IQ data. These 
findings suggested that the data gathered by white and black experimenters 
should be separated to permit a more detailed analysis. As 
can be seen in the left half of Figure 1, a black examiner induced the 
same pattern of mean IQs whether he was working with black males, black 
females, white males, or white females: an instructional effect in the 
evaluative atmosphere and a reactance effect in the gamelike. The U-shaped 
curves are a graphic representation of the Atmosphere x Expectation 
interaction which was mentioned earlier; an analysis of variance on the 
data gathered by a black tester revealed this interaction in significant 
strength (F(1,192) = 5.03, p < .05). 

IQs for white and black subjects faced with a white tester are shown 
in the right half of Figure 1. Perhaps the most striking feature of these 
data is the degree to which the curves for males and females intersect, 
indicating a rather opposite reaction on the part of the two sexes to 
the various test settings. Among blacks exposed to an evaluative atmosphere, 
females conformed to the white tester's expectations while males 
resisted this manipulation and did best when the tester forecast a poor 
performance. Among whites in the evaluative atmosphere it was males who 
conformed to the tester's expectations and females who resisted. The 
sharply contrasting reactions of male and female subjects to the expectation 
treatment thus had an additional racial component in that black 
males resisted while white males conformed, and white females resisted 
while black females conformed. Consequently, an analysis of variance 
revealed a significant Expectation x Sex of Subject x Race of Subject 
interaction (F(1,192) = 4.61, p < .05). In addition, a marginally reliable 
Atmosphere x Expectation x Sex of Subject interaction confirmed that these 
contrasting responses on the part of male and female subjects were most 
pronounced in the evaluative atmosphere (F(1,192) = 3.40, p < .07). 

The only reactance effects observed in the evaluative atmosphere, 
then, were for black males in the presence of an opposite-race tester 
and white females in the presence of an opposite-sex but same-race tester. 
How might these phenomena be interpreted? 

The overall positive correlation between black is beautiful and IQ 
which was found for black (r = +.22) but not for white (r = -.09) males 
in Table 3 may provide a clue as to the psychological processes underlying 
the reactance effect shown by black males in the presence of a white tester 
in the evaluative atmosphere. One finds that in this setting the correlation 
between black is beautiful and IQ became still more positive for black 
males (r = +.32). In the gamelike atmosphere with a white tester, by 
contrast, the correlation between black is beautiful and IQ was negative 
for black males (r = -.30). The difference between these correlations was 
significant (Z = 2.13, p < .02). Belief in black is beautiful was particularly 
positively correlated with IQ for black males who received a low expectation 
from a white tester in the evaluative atmosphere (r = +.49; t(11) = 1.90, 
p < .10). 

Although no overall relationship between women's liberation and IQ 
was found for female subjects of either race, the results for white females 
did parallel those for black males in certain respects. In the evaluative 
atmosphere with a white tester, women's liberation was positively correlated 
with IQ (r = +.21). In the gamelike atmosphere the relationship was 
negative (r = -.49; t(24) = 2.76, p < .02). The difference between these 
correlations was statistically significant (Z = 2.55, p < .01). In the 
low expectation condition in the evaluative atmosphere with a white male 
tester the IQ of white females was positively related to belief in women's 
liberation but not significantly so (r = +.22). 
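
The text does not say how the correlations and their differences were tested. A common approach, sketched here as an assumption rather than as the authors' procedure, converts a correlation to a t statistic with n - 2 degrees of freedom and compares two independent correlations with Fisher's r-to-z transformation. The group size of 26 is inferred from the reported t(24) and from two expectation cells of 13 subjects each; the function names are illustrative.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """t statistic (df = n - 2) for testing a Pearson r against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def fisher_z_difference(r1: float, n1: int, r2: float, n2: int) -> float:
    """Z statistic for the difference between two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z1 - z2) / standard_error


N_PER_ATMOSPHERE = 26  # assumed: 2 expectation cells x 13 subjects

# White females with a white tester: gamelike r = -.49, evaluative r = +.21.
t_gamelike = abs(t_from_r(-0.49, N_PER_ATMOSPHERE))
z_white_females = fisher_z_difference(0.21, N_PER_ATMOSPHERE, -0.49, N_PER_ATMOSPHERE)

# Black males with a white tester: evaluative r = +.32, gamelike r = -.30.
z_black_males = fisher_z_difference(0.32, N_PER_ATMOSPHERE, -0.30, N_PER_ATMOSPHERE)

print(round(t_gamelike, 2), round(z_white_females, 2), round(z_black_males, 2))
```

Run with the rounded correlations given in the text, the sketch returns values within about 0.05 of the reported statistics (2.76, 2.55, and 2.13); small discrepancies are to be expected because the published correlations are rounded to two places.
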

Perhaps being in an evaluative atmosphere with a white male tester 
somehow stimulated the group pride of black males and white females, leading 
to an arousal of reactance motivation when they were challenged by a low 
expectation. If so, the results indicate that for these groups the reactance 
aroused by the tester's challenge was powerful enough to override the 
otherwise general tendency to perform better after receiving a high expec- 
tation in the evaluative atmosphere. There were no comparable findings 
in either the IQ or the personality data for white males or black females. 

DISCUSSION 

It appears that in the non-evaluative gamelike atmosphere test 
performance was facilitated rather than debilitated by the moderate 
anxiety or reactance motivation induced by an examiner's low expectation. 
In the evaluative atmosphere, by contrast, anxiety was by the nature of 
the experimental manipulations induced to be moderately high from the 
start; here, the added stress of an expressed low expectation on the 
part of the tester should have been debilitating. With the exception 
of the reactance effects found for black male and white female subjects 
in the evaluative atmosphere with a white tester, these predictions were 
substantially confirmed, as can be seen in Table 2. 

The correlational data in Table 3 support the hypothesis that a moder- 
ate level of internal arousal induces optimal performance on an intellectual 
task. With less than moderate arousal, the performer will not be motivated 
to take the task seriously and so will focus insufficient attention on 
its completion. Thus, relaxed mood states like elation and surgency as 
well as depressive mood states like fatigue and sadness or a fatalistic, 
external world view were negatively related to IQ in Table 3. With more 
than moderate arousal, however, the performer will be distracted by his 
internal state and may fail utterly. Thus, aggression in addition to 
general and test anxiety were negatively correlated with IQ in Table 3. 

Even though observed IQ seems to have been reliably altered in response 
to the experimental manipulations, there was one major respect in which 
the data were disappointing: The flexibility of interracial differences 
in IQ was not convincingly demonstrated; there was virtually no overlap 
in mean IQ between the various groups of black and white subjects. 
A replication of the experiment did, however, demonstrate the anticipated 
manipulability of interracial IQ differences when socio-economic status 
was treated as an independent variable. 

EXPERIMENT II 

Method 

The research was conducted during 1973-74 at two Sacramento junior 
high schools different from those used for Experiment I. The WISC performance 
measures were administered to 104 white and 104 black male students between 
12 and 16 years of age. The variables of test atmosphere, tester expectation, 
race of tester, and race of subject were placed in a 2x2x2x2 factorial 
design. In all other respects, the procedure was identical to that 
utilized in Experiment I. 

RESULTS 

Since Experiment II duplicated procedures employed with male subjects 
in Experiment I, the two sets of data are discussed together in the 
analyses which follow. Hereafter, the results for males in Experiment I 
will be referred to as the 1972-73 experiment and the results for males 
in Experiment II as the 1973-74 experiment. 
Success of the Manipulations 

Across the 1972-73 and 1973-74 experiments, self-ratings of performance 
were considerably more positive for subjects in the high than in the low 
expectation conditions (F(1,412) = 273.74, p < .001). Elation, too, was 
greater in the high expectation conditions than in the low (F(1,412) = 
14.36, p < .01), while anxiety was greater in the evaluative than in the 
gamelike atmosphere (F(1,412) = 5.74, p < .05). 

IQ Data 

Shown in Figure 2 are the mean IQ scores from the 1972-73 and 1973-74 
experiments (each point representing 13 subjects). The ways in which the 

Insert Figure 2 about here 

second experiment replicated the first will be considered before the 
relatively minor differences between these sets of data are discussed. 

In both studies, whites scored higher in IQ than blacks (F(1,384) 
= 79.59, p < .001). The overall mean IQ for whites was 112.25 while that 
for blacks was 99.91, an interracial difference of 12.34 points. Students 
of both races performed better in the presence of a white rather than 
a black tester (F(1,384) = 23.17, p < .01). 

Also in both experiments, a significant Atmosphere x Expectation 
interaction (F(1,384) = 8.74, p < .01) developed from the mean IQs shown 
in Table 4. These means combined the IQs of black or white male subjects 

Insert Table 4 about here 

tested by white or black experimenters. In an evaluative atmosphere, 
students scored higher in IQ if told they would do well than if told 
they would do poorly (t = 2.24, p < .05), an instructional effect. In 
the gamelike atmosphere, students did best when the tester was critical 
rather than encouraging (t = 1.97, p < .05), a reactance effect. The 
generally U-shaped curves in Figure 2 are the graphic representation of 
the Atmosphere x Expectation interaction shown in Table 4. 

The results for the 1973-74 experiment differed from those gathered 
for male subjects in 1972-73 in just two significant respects: First, 
subjects in the 1973-74 study had marginally higher IQs than those in 
the 1972-73 experiment (F(1,384) = 3.30, p < .10). Second, black subjects 
in the 1972-73 experiment who received a low expectation in the evaluative 
atmosphere from a white tester scored 5.00 points above black subjects given 
a high expectation in this setting. In 1973-74, however, black subjects 
who received a high expectation in the evaluative atmosphere from a white 
tester scored 7.92 points above those given a low expectation. Underlying 
both of the foregoing differences between the two experiments may be the 
fact that students in the 1973-74 study were of higher SES (12 points on 
the 100-unit scale) than those in the 1972-73 research (t(414) = 7.85, 
p < .001). The four schools from which students were sampled each had 
approximately the same proportion of black students (about 2b%), but 
the two schools in which the 1973-74 experiment was conducted were located 
in more prosperous neighborhoods. 

Consequently, the data for male subjects in the 1972-73 and 1973-74 
studies were combined and the population divided into groups above and 
below the median in SES. The results are shown in Table 5 and Figure 3. 
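
A median split of this kind can be sketched as follows. The subject identifiers and SES scores are hypothetical, and the text does not say how subjects falling exactly at the median were assigned, so the sketch assumes that none do.

```python
from statistics import median

# Hypothetical (subject_id, SES index) pairs on the 0-100 scale; not study data.
ses_scores = [("s01", 22), ("s02", 35), ("s03", 41), ("s04", 48),
              ("s05", 55), ("s06", 63), ("s07", 70), ("s08", 84)]

cutoff = median(score for _, score in ses_scores)

# Assumption: no subject falls exactly on the median, so strict comparisons
# place every subject in exactly one group.
high_ses = [sid for sid, score in ses_scores if score > cutoff]
low_ses = [sid for sid, score in ses_scores if score < cutoff]

print(cutoff, high_ses, low_ses)
```

Treating SES as a two-level factor in this way is what allows it to enter the factorial analysis alongside the experimental manipulations.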

Insert Table 5 and Figure 3 about here 

Clearly, subjects above the median in SES scored substantially higher in 
IQ than those below the median (F(1,384) = 24.41, p < .01). 

All but one of the functions in Figure 3 is U-shaped, indicating 
that both high and low SES subjects displayed the Atmosphere x Expectation 
interaction mentioned earlier, with low SES blacks providing the sole 
exception. The latter, after receiving a low expectation from a white 
tester in an evaluative atmosphere, scored 2.41 points above their high 
expectation counterparts. Though this is not a significant difference, 
it is the same phenomenon which was observed in the 1972-73 experiment 
and which failed to replicate in 1973-74, apparently because the latter 
population contained a greater proportion of high SES members. 

Table 5 and Figure 3 indicate that high SES black students responded 
to the experimental manipulations in much the same way as did whites of 
either high or low SES. High SES black students did rather well on the 
WISC: When they were given encouragement by a white male tester in the 
evaluative atmosphere, their mean IQ reached 114.60, a value exceeded by 
whites in only three out of sixteen cells. Two interactions are relevant 
to this finding: Atmosphere x SES x Race of Experimenter (F(1,384) = 4.74, 
p < .05), which seems to have developed from the fact that within each 
racial group the best performance was recorded for high SES students in 
the presence of a white tester in the evaluative atmosphere, and Atmosphere 
x SES x Race of Subject x Race of Experimenter (F(1,384) = 8.76, p < .01), 
which is somewhat attributable to the observation that the IQ of high 
SES blacks equaled that of low SES whites in the evaluative atmosphere 
with a white tester and in the gamelike atmosphere with a black tester. 

Mood, Motivation, and IQ 

The personality and mood correlates of test performance for males in 
the 1972-73 and 1973-74 experiments will not be described at length, since 
they paralleled the findings shown in Table 3 for Experiment I. More 
directly relevant to the hypothesis that internal arousal must be at 
a moderate level for optimal performance are the mean scores for aggression 
and anxiety, shown in Table 6. 

Anxiety and aggressive motivation were minimal in the relaxed gamelike 
atmosphere when the tester praised the subject's abilities, but both arousal 
states showed an increase when the tester expressed a low expectation in 
this setting. As Table 4 indicates, IQ increased along with the increasing 
motivation. In the more stressful evaluative atmosphere, however, anxiety 
and aggression seem to have become excessive when the tester induced the 
subject's state of internal arousal to go beyond the optimal level through 
criticism of the latter's ability. Here, it was the encouragement offered 
by a high tester expectation which maintained arousal at a moderate level 
and permitted peak performance on the WISC. 

Reactance Effects Among Low SES Students 

It was noted earlier that low SES black students whose ability was 
criticized by a white tester in the evaluative atmosphere seemed to resist 
the tester's low expectation by outscoring their counterparts in the high 
expectation condition. In Experiment I, this phenomenon was observed among 
white females as well as black males, and it was suggested that the 
group pride of these subjects was challenged by a white male tester to 
a degree not felt by white male or black female subjects. 

The present data indicate, however, that low SES white males may also 
to some extent be challenged by a white tester's low expectation. In 
a separate analysis of the IQ data gathered by a white tester (that is, 
the right half of Figure 3) a marginal Expectation x SES interaction 
emerged (F(1,192) = 2.73, p = .10). In general, high SES subjects performed 
better on the WISC after being encouraged by a high expectation while low 
SES subjects tended to do better following a low expectation. In addition, 
an Atmosphere x SES interaction (F(1,192) = 4.78, p < .05) revealed that 
high SES subjects excelled in the evaluative atmosphere while low SES 
subjects performed best in the gamelike (especially, it appears in Figure 3, 
if the tester expressed a low expectation). Even if attention is restricted 
to the evaluative atmosphere, however, the Expectation x SES interaction 
persists (F(1,96) = 3.01, p < .10). Finally, of course, the Atmosphere x 
Expectation interaction was also found to be significant (F(1,192) = 5.11, 
p < .05). None of the foregoing effects interacted with the race of the 
subject (all Fs < 1). 

When the IQ data gathered by a black tester (the left half of Figure 3) 
were separately analyzed, the Expectation x SES interaction did not appear 
(F < 1). An Atmosphere x SES x Race of Subject interaction (F(1,192) = 4.80, 
p < .05) reflected for the most part the equalization of high SES black 
and low SES white IQs in the gamelike atmosphere, and the Atmosphere x 
Expectation interaction was also significant (F(1,192) = 4.17, p < .05). 
Despite these reliable effects, however, it seems that a black tester did 
not motivate or challenge subjects to the same degree as his white counterpart; 
subjects of both races scored lower in IQ in the presence of a black tester. 
This could mean that a black tester was not taken as seriously as a white 
one (that is, subjects did not try as hard to impress him), so the changes 
in IQ induced by his communication of the atmosphere and expectation 
manipulations would have worked off a lower baseline of testee motivation. 
Black students seemed to be inspired to achieve a relatively high IQ in 
the presence of a black tester only when those of high SES were stprtled 
by a low expectation in what they had been led to believe was a "do your 
thing" gamelike atmosphere. Naturally, it must be kept in mind that just 
one black and one white experimente'r gathered the IQ data. Any effects 
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attributed to race of tester are potentially confounded by the personalities 
of the individual experimenters and their proficiency in administering the 
Wise. Only further research and replication can clarify the mechanisms 
underlying race of tester effects. 
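
The statement that subjects of both races scored lower with a black tester
can be illustrated by pooling the cell means of Table 5 within each
combination of subject race and tester race, weighting each cell by its n.
The sketch below is only a descriptive summary computed from the rounded
table entries; the pooled values are not figures reported in the original
analysis, and the names TABLE_5 and weighted_mean are illustrative.

```python
# Table 5 cells as (mean IQ, n), keyed by (subject race, tester race).
# Within each list the order is SES (high, low) x atmosphere
# (evaluative, gamelike) x expectation (high, low).
TABLE_5 = {
    ("black", "black"): [(99.62, 8), (93.22, 9), (101.78, 9), (107.73, 11),
                         (97.22, 10), (95.47, 17), (91.59, 17), (93.00, 15)],
    ("black", "white"): [(114.60, 10), (107.67, 9), (103.08, 13), (104.67, 9),
                         (98.12, 16), (100.53, 17), (97.08, 13), (104.82, 17)],
    ("white", "black"): [(113.72, 18), (111.25, 16), (108.25, 16), (112.13, 15),
                         (108.75, 8), (99.60, 10), (106.20, 10), (107.73, 11)],
    ("white", "white"): [(126.76, 17), (115.36, 14), (114.35, 17), (118.12, 17),
                         (110.22, 9), (108.50, 12), (110.44, 9), (114.22, 9)],
}


def weighted_mean(cells):
    """Mean of the cell means, each weighted by its cell size."""
    total_n = sum(n for _, n in cells)
    return sum(mean * n for mean, n in cells) / total_n


pooled = {key: weighted_mean(cells) for key, cells in TABLE_5.items()}
for (subject, tester), value in pooled.items():
    print(f"{subject} subjects, {tester} tester: {value:.2f}")
```

Pooled in this way, the means are roughly 96.7 (black tester) against 103.1
(white tester) for black subjects, and 109.2 against 115.7 for white
subjects, a difference of about six points in each case.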

DISCUSSION 

Among both black and white subjects, instructional effects (peak 
performance in response to praise) predominated in the evaluative atmosphere 
while reactance effects (peak performance in response to criticism)
predominated in the gamelike. This Atmosphere x Expectation interaction 
is interpreted as signifying that, in the ego-threatening evaluative 
atmosphere, internal arousal (one component of which is anxiety) was optimal 
when the subject was reassured by a high tester expectation but became 
excessive and, hence, performance-debilitating when the tester was critical. 
In the relaxed gamelike atmosphere, by contrast, a tester's low expectation 
served to elevate arousal from a low, insufficiently motivating level to 
a moderate, optimally motivating one and so facilitated test performance. 
In Tables 2 and 4 it can be seen that the range of variation in mean IQ 
which appears to be attributable to this Atmosphere x Expectation interaction
is around 4-5 points.
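
The size of this interaction can be read off the four cell means of
Table 2. The short sketch below computes the effect of a high versus a
low expectation within each atmosphere, together with the spread between
the highest and lowest cell; Table 4 is not used, and the names TABLE_2
and expectation_effect are illustrative.

```python
# Mean IQ from Table 2, keyed by (atmosphere, expectation).
TABLE_2 = {
    ("evaluative", "high"): 104.99,
    ("evaluative", "low"): 102.75,
    ("gamelike", "high"): 101.51,
    ("gamelike", "low"): 106.33,
}


def expectation_effect(atmosphere):
    """High-expectation mean minus low-expectation mean in one atmosphere."""
    return TABLE_2[(atmosphere, "high")] - TABLE_2[(atmosphere, "low")]


evaluative_effect = expectation_effect("evaluative")  # positive: praise helps
gamelike_effect = expectation_effect("gamelike")      # negative: criticism helps
spread = max(TABLE_2.values()) - min(TABLE_2.values())

print(f"evaluative: {evaluative_effect:+.2f}")
print(f"gamelike:   {gamelike_effect:+.2f}")
print(f"spread:     {spread:.2f}")
```

The expectation effect is about +2.2 points in the evaluative atmosphere
and about -4.8 points in the gamelike, and the spread across the four
cells is about 4.8 points, in line with the 4-5 point range cited above.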

For any testee, then, the most facilitative environment seems to be 
one which develops and maintains internal arousal at an optimal, moderate 
level, avoiding the extremes of anxiety or disinterest. Intriguingly,
Doob and Kirshenbaum (1973), in a study of the effects of frustration and
aggressive films on emotional arousal, similarly discovered that performance
on a digit symbol task was a U-shaped function of arousal. Moderately
elevated levels of blood pressure produced peak performance on the digit
symbols while normal resting levels or excessively high levels served
to debilitate performance.

It is possible, of course, that some construct other than testee 
motivation might be able to account for the data. Since the experimenters 
overtly communicated their expectations to the subjects, demand effects 
were no doubt operative (Orne, 1962). While such demands could explain
the instructional effects in the evaluative atmosphere, however, they
cannot easily account for the reactance effects in the gamelike setting.
Furthermore, Experimenter 2, who administered the subscales from which a
given subject's IQ was calculated, was blind as to the subject's prior
treatment by Experimenter 1. All experimenters were kept ignorant of
the hypotheses until the conclusion of the research, but regardless of
that precaution Experimenter 2 would have been unable to place differential
demands on the subjects' behavior so as to confirm any predictions.

A more sophisticated alternative explanation for the results might 
involve Rosenberg's (1965) concept of evaluation apprehension. Perhaps 
certain groups of subjects, like white females or black males or students
of low SES, were more likely to discuss the experiment among themselves
because they were more fearful of being tested. Armed with prior knowledge
of the research procedures, they may have resisted the experimenter's
expectation manipulation as a way of telling him that they were aware of
his efforts at deceiving them. This alternative does not, however,
explain why gossip would be most likely to induce such resistance (a
reactance effect) if a given subject's tester happened to be white rather
than black, nor does it explain why the data for all groups of subjects,
not just the most apprehensive, showed reactance effects in the gamelike
atmosphere. If enough untested assumptions are included, evaluation
apprehension could become a viable alternative explanation of the
data; at present, though, a motivational interpretation seems more
parsimonious.

To the extent that reactance effects were observed in Experiments
I and II, the findings appear to contradict those gathered in the
"self-fulfilling prophecy" or "Pygmalion" paradigm initiated by Rosenthal
and Jacobson (1968). They found that when a teacher had been induced to have
a high expectation of the abilities of certain randomly-selected students,
the classroom performance of these students improved; by implication,
a teacher's low expectation should debilitate performance. How can the
results of the present research, in which an overtly-expressed low
expectation seemed sometimes to motivate or challenge students to do their
best on the WISC, be reconciled with those in the self-fulfilling prophecy
tradition? The answer may lie in the word overt. Chaikin, Sigler, and
Derlega (1974) led undergraduate tutors to believe that a 10-year-old
interviewee was either "quite bright" (IQ = 130) or "somewhat slow"
(IQ = 85). It was found that tutors expecting a bright pupil leaned
toward the interviewee, looked him in the eye, nodded their heads up and
down, and smiled more frequently than tutors expecting a dull pupil;
the former were also less likely to exhibit behaviors indicating dislike
or disapproval, such as leaning backwards. Word, Zanna, and Cooper (1974)
found that subjects exposed to an interviewer trained to emit standardized
nonverbal cues of disapproval made a poorer impression on naive raters
than those exposed to a nonverbally approving interviewer. So the subtle
communication of a low expectation may indeed produce the well-known
Pygmalion effect. However, an evaluator's low expectation may induce a
poor performance on the part of examinees in such situations because it
is so subtly expressed that any challenge to it is short-circuited by
the ambiguity in the situation. Research in which such things as atmosphere
and expectation manipulations were either subtly or obviously communicated 
by the tester to the testee should serve to clarify the conditions under 
which one might anticipate a self-fulfilling prophecy rather than a 
reactance or challenge phenomenon. 

The present research suggests that the variables of atmosphere and 
expectation may, when overtly expressed, interact with the subject's 
race and social class so as to have a considerable impact on his or her 
IQ score. If reliable and replicable, such findings would call into 
question Jensen's (1969) assertion that, since differences in the social 
and psychological environments to which white and black Americans are 
routinely exposed appear insufficient to account for interracial differences 
in mean IQ, a genetic explanation of these differences is called for. 

Important questions remain, to be sure. Why, for instance, do 
interracial differences persist across parallel conditions? Even though 
high SES blacks performed remarkably well on the WISC when tested by a
white experimenter in the evaluative atmosphere, why were they still
outperformed by high SES whites in this same setting? Many answers are
possible. The students were in the experimental situation for less than 
an hour; the cumulative effects of differential past experience for black 
and white subjects may not be so easily overcome. Furthermore, even though 
a white tester may, in general, have been more motivating than his black 
counterpart, he was probably not an unequivocally positive stimulus for 
a black student. 

Since Sacramento is a medium-sized, highly mobile city in which the
schools participating in the research were at most a few miles and in one
instance a few blocks apart, it seems rather doubtful that the "high"
and "low" categories created by the median split on the SES dimension
reflect substantially different gene pools. If so, if experience rather
than heredity can be regarded as the major difference between the high
and low SES groups, then the results would seem to imply that interracial
differences in mean IQ can be erased or possibly even reversed depending 
on certain social-psychological characteristics of the test setting and 
the socio-economic background of the testee. 
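
One way to make this claim concrete is to compare, cell by cell, the mean
IQ of low SES white males with that of high SES black males in Table 5.
The sketch below is offered only as one possible reading of the claim; it
uses the rounded table entries, performs no significance test, and the
names HIGH_SES_BLACK, LOW_SES_WHITE and gaps are illustrative.

```python
# Mean IQ from Table 5 for two groups of male subjects, keyed by
# (tester race, atmosphere, expectation).
HIGH_SES_BLACK = {
    ("black", "evaluative", "high"): 99.62,
    ("black", "evaluative", "low"): 93.22,
    ("black", "gamelike", "high"): 101.78,
    ("black", "gamelike", "low"): 107.73,
    ("white", "evaluative", "high"): 114.60,
    ("white", "evaluative", "low"): 107.67,
    ("white", "gamelike", "high"): 103.08,
    ("white", "gamelike", "low"): 104.67,
}
LOW_SES_WHITE = {
    ("black", "evaluative", "high"): 108.75,
    ("black", "evaluative", "low"): 99.60,
    ("black", "gamelike", "high"): 106.20,
    ("black", "gamelike", "low"): 107.73,
    ("white", "evaluative", "high"): 110.22,
    ("white", "evaluative", "low"): 108.50,
    ("white", "gamelike", "high"): 110.44,
    ("white", "gamelike", "low"): 114.22,
}

# Positive gap: low SES whites scored higher; negative: high SES blacks did.
gaps = {
    cell: round(LOW_SES_WHITE[cell] - HIGH_SES_BLACK[cell], 2)
    for cell in HIGH_SES_BLACK
}

for cell, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(cell, f"{gap:+.2f}")
```

On this comparison the gap is zero for the black tester in the gamelike
atmosphere with a low expectation, and it is reversed (about 4.4 points in
favor of high SES black subjects) for the white tester in the evaluative
atmosphere with a high expectation; in the remaining cells the low SES
white mean is the higher of the two.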
Table 1

Mean IQ for Male and Female Subjects

                                      Black Subjects       White Subjects
                                      Black     White      Black     White
Sex      Atmosphere   Expectation     Tester    Tester     Tester    Tester

Male     Evaluative   High             98.54     97.00     111.08    121.46
                      Low              91.69    102.00     109.38    110.31
         Gamelike     High             93.23     96.46     107.15    109.46
                      Low              96.23    105.62     109.15    118.46

Female   Evaluative   High             96.69     97.38     106.78    111.00
                      Low              91.69     94.69     104.76    117.53
         Gamelike     High             89.30    101.06     103.15    112.30
                      Low              97.69     97.46     110.15    115.92

There were 13 subjects per cell. The following critical values for
assessing the significance of differences between means have been derived
from procedures for individual comparisons in Hays (1963):
17.09 (p < .002), 12.82 (p < .02), 10.83 (p < .05), 9.12 (p < .10).
Table 2

The Atmosphere x Expectation Interaction
for Male and Female Subjects(a)

                      Atmosphere
Expectation     Evaluative    Gamelike

High              104.99       101.51
Low               102.75       106.33

(a) There were 104 subjects per cell.
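
The entries of Table 2 can be recovered, to within rounding, by averaging
the eight cell means of Table 1 that share an atmosphere and an
expectation. The sketch below assumes that simple averaging is appropriate
because every cell of Table 1 holds 13 subjects; the names TABLE_1,
TABLE_2 and pooled_means are illustrative.

```python
# Cell means from Table 1, keyed by (atmosphere, expectation). Each list
# holds the eight sex x subject-race x tester-race cells for that condition
# (male cells first, then female).
TABLE_1 = {
    ("evaluative", "high"): [98.54, 97.00, 111.08, 121.46,
                             96.69, 97.38, 106.78, 111.00],
    ("evaluative", "low"): [91.69, 102.00, 109.38, 110.31,
                            91.69, 94.69, 104.76, 117.53],
    ("gamelike", "high"): [93.23, 96.46, 107.15, 109.46,
                           89.30, 101.06, 103.15, 112.30],
    ("gamelike", "low"): [96.23, 105.62, 109.15, 118.46,
                          97.69, 97.46, 110.15, 115.92],
}
# Printed entries of Table 2, for comparison.
TABLE_2 = {
    ("evaluative", "high"): 104.99,
    ("evaluative", "low"): 102.75,
    ("gamelike", "high"): 101.51,
    ("gamelike", "low"): 106.33,
}
SUBJECTS_PER_CELL = 13


def pooled_means(table):
    """Average the cell means per condition (valid because n is equal)."""
    return {cond: sum(means) / len(means) for cond, means in table.items()}


pooled = pooled_means(TABLE_1)
subjects_per_condition = SUBJECTS_PER_CELL * len(TABLE_1[("evaluative", "high")])

for cond, value in pooled.items():
    print(cond, f"computed {value:.3f}", f"printed {TABLE_2[cond]:.2f}")
print("subjects per condition:", subjects_per_condition)
```

Each computed average lies within 0.01 point of the printed entry, and
eight cells of 13 subjects give the 104 subjects per cell noted under
Table 2.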
Table 3

Personality Correlates of IQ
by Race and Sex of Subject

                                  Black Subjects                       White Subjects
                          Combined     Male         Female       Combined     Male         Female

Aggression                [illegible]  [illegible]  [illegible]  -.26****     -.35****     -.18*
Concentration             [illegible]  +.12         +.06         +.20***      +.31***      +.11
Egotism                   [illegible]  [illegible]  [illegible]  -.12*        -.18*        -.08
Elation                   -.14**       -.25***      -.02         -.10         -.04         -.18*
Fatigue                   -.17***      -.24***      -.10         -.09         -.10         -.10
Sadness                   -.26****     -.37****     -.20**       -.22****     -.13         -.30***
Skepticism                -.13*        -.07         -.20**       -.03         -.07          .00
Surgency                  [illegible]  -.18*        -.09         -.02         -.11         +.05
How Well (Self-rate)      +.08         +.06         +.11         +.18***      +.20**       +.16
Black is Beautiful(a)     +.16**       +.22**       +.07         -.05         -.09         +.03
General Anxiety(a)        -.26****     -.30****     -.19*        -.06         -.16         +.01
Test Anxiety(a)           -.18***      -.17*        -.18*        -.12*        -.12         -.10
[label illegible]         -.23****     -.25***      -.19*        [illegible]  -.26***      -.39****

(a) These correlations were derived from items on the follow-up questionnaire,
which a small number of subjects failed to complete.

****p < .002
***p < .02
**p < .05
*p < .10
Table 5

Mean IQ for Male Subjects
High or Low in SES

                                        Black Subjects       White Subjects
                                        Black     White      Black     White
SES    Atmosphere   Expectation         Tester    Tester     Tester    Tester

High   Evaluative   High                 99.62    114.60     113.72    126.76
                             n =             8        10         18        17
                    Low                  93.22    107.67     111.25    115.36
                             n =             9         9         16        14
       Gamelike     High                101.78    103.08     108.25    114.35
                             n =             9        13         16        17
                    Low                 107.73    104.67     112.13    118.12
                             n =            11         9         15        17

Low    Evaluative   High                 97.22     98.12     108.75    110.22
                             n =            10        16          8         9
                    Low                  95.47    100.53      99.60    108.50
                             n =            17        17         10        12
       Gamelike     High                 91.59     97.08     106.20    110.44
                             n =            17        13         10         9
                    Low                  93.00    104.82     107.73    114.22
                             n =            15        17         11         9

The following critical values for assessing the significance of
differences between means have been derived from procedures for individual
comparisons in Hays (1963): 17.63 (p < .002), 12.84 (p < .02),
10.80 (p < .05), 9.04 (p < .10).
Table 6

Mean Anxiety and Aggression
for Male Subjects

                         Anxiety(a)                        Aggression(b)
                         Atmosphere                         Atmosphere
Expectation     Evaluative  Gamelike  p(diff)      Evaluative  Gamelike  p(diff)

High               3.24       2.45     .02            2.36       2.39     n.s.
Low                3.32       3.05     n.s.           3.19       2.89     n.s.
p(diff)            n.s.       .10                     .05        n.s.

(a) F(Atmosphere) = [illegible], [illegible] df, p < .05
(b) F(Expectation) = [illegible], 1/412 df, p < .05
Mean IQ for males in the 1972-73 and 1973-74 experiments.



